Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make LLMModelFactory and VLMModelFactory inits public #226

Merged
merged 8 commits into from
Mar 8, 2025

Conversation

ibrahimcetin
Copy link
Contributor

This PR makes LLMModelFactory and VLMModelFactory inits public.

The PR currently not ready and I want to discuss some cases:

  1. I created shared instance on registry types. Is it a good approach? First, I thought init() should create default registry but I think it isn't a good approach because people going to expect an empty registry.
  2. Currently, there are two ModelRegistry class which is same. I think we should merge these two to one on MLXLMCommon package. Maybe, we can make it as a base class and create LLMRegistry and VLMRegistry class separately as a subclass of ModelRegistry. (Same for ModelTypeRegistry)
  3. I didn't make the changes but we may use String(describing:) for ProcessorTypeRegistry registration. For example,
public func registerProcessorType<T>(
        _ type: T.Type,
        creator: @Sendable @escaping (
            URL,
            any Tokenizer
        ) throws -> any UserInputProcessor
    ) {
        let typeName = String(describing: type)
        lock.withLock {
            creators[typeName] = creator
        }
    }

This is just for showing but if we use String(describing:) it will prevent typos and make the library more type safe IMO.

In conclusion, I want to be sure is this approach is acceptable? If so, I will do the same for the remaining parts.


instance.registerProcessorType("PaliGemmaProcessor", creator: create(PaliGemmaProcessorConfiguration.self, PaligGemmaProcessor.init))
instance.registerProcessorType("Qwen2VLProcessor", creator: create(Qwen2VLProcessorConfiguration.self, Qwen2VLProcessor.init))
instance.registerProcessorType("Idefics3Processor", creator: create(Idefics3ProcessorConfiguration.self, Idefics3Processor.init))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense to me and I agree with your thinking in the description. Two thoughts on improvements:

  • this could look more like a table (like the deleted code below) and I think it will be more obvious for people to edit
  • perhaps the table should be extracted as a private let at the top level -- we can put all the tables at the top of the file so it is even more obvious what people need to edit to register these built in models

I wonder if we would want a way to copy shared in case you wanted a private registry but starting with the default types? Or perhaps you just use shared in that case. The use case I am imagining is you have a chat application that can use any of these shared models but you also have a private model in your app. In that case you want a merge and updating the global would be fine.

If you provided a framework / swiftpm library and wanted to register your types perhaps you would have a static method that took a VLMModelFactory and updated the registries on that. That would allow you to add to a pre-populated registry or an empty one.

So I think we don't need a way to copy() a registry -- if something comes up later we can revisit.

See what you think about moving the default registry to a table at the top of the file.

@davidkoski
Copy link
Collaborator

  1. I created shared instance on registry types. Is it a good approach? First, I thought init() should create default registry but I think it isn't a good approach because people going to expect an empty registry.

See comment in the code -- I think shared is reasonable and it lets people construct different variations.

  1. Currently, there are two ModelRegistry class which is same. I think we should merge these two to one on MLXLMCommon package. Maybe, we can make it as a base class and create LLMRegistry and VLMRegistry class separately as a subclass of ModelRegistry. (Same for ModelTypeRegistry)

A base class makes sense, something like AbstractModelRegistry and the VLM/LLM variants just subclass that. I think the piece that blocked this in the past was how to handle the default registry, but a shared static might deal with that.

Right now the ModelRegistry is a place where the static model configurations live, e.g. qwen2VL2BInstruct4Bit and I think that is useful.

  1. I didn't make the changes but we may use String(describing:) for ProcessorTypeRegistry registration. For example,
public func registerProcessorType<T>(
        _ type: T.Type,
        creator: @Sendable @escaping (
            URL,
            any Tokenizer
        ) throws -> any UserInputProcessor

So this would be called like this:

registry.registerProcessorType(PaligGemmaProcessor.self) { url, tokenizer in
    let config = try JSONDecoder().decode(PaliGemmaProcessorConfiguration.self, from: Data(contentsOf: url))
    return PaligGemmaProcessor(configuration, tokenizer: tokenizer)
}

it looks like String(describing:) gives just the type name without the module (good in this case):

  2> import Foundation
  3> Date.self
$R1: Foundation.Date.Type = Foundation.Date
  4> String(describing: Date.self)
$R2: String = "Date"

A few concerns:

  • I think we still need the String version of this call to handle cases where the type and the string in the config don't match. I don't know for certain this will happen as the string seems to be a python type name, but I think we should be prepared for it
  • there isn't a tie between the type passed and the actual processor that is created -- perhaps we can do that with an adjustment to the signature (below)
public func registerProcessorType<T>(
        _ type: T.Type,
        creator: @Sendable @escaping (
            URL,
            any Tokenizer
        ) throws -> T

If we can enforce the type name and the type matching like that I think this looks valuable.

@ibrahimcetin
Copy link
Contributor Author

Right now the ModelRegistry is a place where the static model configurations live, e.g. qwen2VL2BInstruct4Bit and I think that is useful.

I agree. To clarify my thinking, I would like to add an example code here:

While we add new features on ModelRegistry, we have to duplicate them in the current implementation. But if we move it in MLXLMCommon, we may define LLMRegistry and VLMRegistry on MLXLLM and MLXVLM respectively.

// in MLXLMCommon
public class ModelRegistry: @unchecked Sendable {
    /// Creates an empty registry.
    public init() {
        registry = Dictionary()
    }

    /// Creates a new registry with from given model configurations.
    public init(modelConfigurations: [ModelConfiguration]) {
        registry = Dictionary(uniqueKeysWithValues: modelConfigurations.map { ($0.name, $0) })
    }

    private let lock = NSLock()
    private var registry: Dictionary<String, ModelConfiguration>

    public func register(configurations: [ModelConfiguration]) {
        lock.withLock {
            for c in configurations {
                registry[c.name] = c
            }
        }
    }

    public func configuration(id: String) -> ModelConfiguration {
        lock.withLock {
            if let c = registry[id] {
                return c
            } else {
                return ModelConfiguration(id: id)
            }
        }
    }

    public var models: some Collection<ModelConfiguration> & Sendable {
        lock.withLock {
            return registry.values
        }
    }
}

// in MLXVLM
class VLMRegistry: ModelRegistry {
    /// Shared instance with default model configurations.
    public static let shared = ModelRegistry(modelConfigurations: all())

     static private func all() -> [ModelConfiguration] {
        [
            paligemma3bMix448_8bit,
            qwen2VL2BInstruct4Bit,
        ]
    }

    static public let paligemma3bMix448_8bit = ModelConfiguration(
        id: "mlx-community/paligemma-3b-mix-448-8bit",
        defaultPrompt: "Describe the image in English"
    )

    static public let qwen2VL2BInstruct4Bit = ModelConfiguration(
        id: "mlx-community/Qwen2-VL-2B-Instruct-4bit",
        defaultPrompt: "Describe the image in English"
    )

    static public let smolvlminstruct4bit = ModelConfiguration(
        id: "mlx-community/SmolVLM-Instruct-4bit",
        defaultPrompt: "Describe the image in English"
    )
}

// in MLXLLM
class LLMRegistry: ModelRegistry {
    // Same as MLXVLM
}

This is just to avoid code duplication. What do you think?

@davidkoski
Copy link
Collaborator

This is just to avoid code duplication. What do you think?

Yes, looks great!

@ibrahimcetin
Copy link
Contributor Author

@davidkoski Thanks for the feedback. I want to limit this PR with these changes and it is ready now. I will move ModelRegistry and ModelTypeRegistry to MLXLMCommon after this PR is merged. Then, I will add contains(id:) method on ModelRegistry as discussed in #224.

@davidkoski
Copy link
Collaborator

CI failed on the swift-format check -- can you please run:

@ibrahimcetin
Copy link
Contributor Author

@davidkoski Done

Copy link
Collaborator

@davidkoski davidkoski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the additions!

@davidkoski davidkoski merged commit 3885b92 into ml-explore:main Mar 8, 2025
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants