Hugging Face has contributed significantly to the progress of applied machine learning, especially in NLP. Much of that impact comes from its focus on building a platform where the community can easily access and share models, tools, and datasets. That’s why Hugging Face has become a go-to place for contributing and showcasing machine learning work.
Because the platform plays such a central role in current machine learning work, it’s worth understanding the Hugging Face Hub in more detail. This article focuses on the Model Hub and the Community, where most of the work happens.
Let’s jump into it.
Hugging Face Hub Platform in General
As mentioned above, Hugging Face is a company focused on machine learning development by building a platform that allows easy access for sharing and contributing to the community. The platform is called Hugging Face Hub.
Hugging Face Hub is a platform that hosts publicly available and open-source models, datasets, and apps. The community can easily access everything in the Hub individually or in collaboration. The Hub’s structure is shown below.
Let’s explore how the Hugging Face Model Hub works.
Hugging Face Model Hub
The Hugging Face Model Hub is where models are hosted for various machine learning tasks, such as image classification, question answering, text-to-speech, and many more. The community can use the Model Hub to share and discover models that are useful for downstream tasks.
Let’s break down the key elements of the Model Hub.
Model repository
The first thing we explore is the Model Repository. A Model Repository is similar to a GitHub repository: users can upload models and share them with the public. You can see an example of a Model Repository in the image below.
Each repository could store the models we have trained for specific tasks. Users can upload the model in formats like PyTorch, TensorFlow, or JAX. Let’s select one of the repositories and see what is inside.
The image above shows the repository for the Mistral-Nemo-Instruct-2407 pre-trained model. It contains several sections, including the Model Card, Files and Versions, and Community. Let’s look at each one.
The Model Card contains comprehensive information about the model. You can fill it with information, including model architecture, training data, usage instructions, and model performance.
The repository also stores the model files, including the configuration and the model weights. It comes with Git-based version control, which lets you track the changes made in each update and revert to an older version if necessary.
In each repository, you can open discussion threads and pull requests to collaborate with the community. If you are familiar with GitHub, this feature works in much the same way.
Lastly, the Model Repository shows download statistics, code templates for training and deployment, and an Inference API widget for testing the model. It also lists the Hugging Face Spaces that use the model.
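The version history can also be read from code. The following is a minimal sketch, assuming a recent `huggingface_hub` release that provides `HfApi.list_repo_commits` and `HfApi.list_repo_files`. The helper name `recent_commits` is made up for this illustration, and the client is passed in as an argument so the helper can be tried without network access.

```python
def recent_commits(api, repo_id, max_commits=3):
    """Return (short_hash, title) pairs for the first commits the Hub returns.

    `api` is expected to behave like huggingface_hub.HfApi. The helper name
    and its parameters are illustrative, not part of the library.
    """
    commits = api.list_repo_commits(repo_id)
    return [(c.commit_id[:7], c.title) for c in commits[:max_commits]]


def main():
    # Needs `pip install huggingface_hub` and network access.
    from huggingface_hub import HfApi

    api = HfApi()
    repo_id = "Groq/Llama-3-Groq-70B-Tool-Use"  # repository used later in this article
    for short_hash, title in recent_commits(api, repo_id):
        print(short_hash, title)
    # List the files as they exist on the default branch.
    print(api.list_repo_files(repo_id, revision="main"))
```

Calling `main()` prints a few commits and the file list. Any commit hash from that history can be used as a revision when downloading files.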
Model Hub Search
With so many repositories in the Hugging Face Model Hub, it would be hard to find the model we need by browsing repositories one at a time. That’s why there is a search bar.
You can search for a model using keywords related to your requirements, such as the machine learning task or the model framework. Ideally, each repository is tagged with relevant metadata, such as the task, framework, and dataset used. If a repository is tagged correctly, you can also find it by filtering with the tabs (Tasks, Libraries, Datasets, Languages, Licenses, Other).
For a broader search, you can use the full-text search, which looks for model repositories containing the text you enter.
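The same tag and keyword search can be run from code. This is a hedged sketch that assumes the `huggingface_hub` library and its `HfApi.list_models` method with the `filter`, `search`, and `limit` arguments. The helper `find_model_ids`, the task tag, and the keyword are illustrative choices only.

```python
def find_model_ids(api, task, keyword, limit=5):
    """Return ids of up to `limit` models tagged `task` whose id contains `keyword`.

    `api` is expected to behave like huggingface_hub.HfApi; the helper itself
    is illustrative and not part of the library.
    """
    models = api.list_models(filter=task, search=keyword, limit=limit)
    return [model.id for model in models]


def main():
    # Needs `pip install huggingface_hub` and network access.
    from huggingface_hub import HfApi

    for model_id in find_model_ids(HfApi(), "text-classification", "bert"):
        print(model_id)
```

Calling `main()` prints a handful of matching repository ids, which you can then open on the website or pass to the download functions.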
Model Hub Integration
Once you select the model you want to use in the Hugging Face Model Hub, you can download all the files manually or use the huggingface_hub library to interact with the repository. Let’s try it out.
First, we need to install the library.
pip install huggingface_hub
Next, get a user access token from your settings page and use it to log in.
from huggingface_hub import login

login("USER_ACCESS_TOKEN")
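Pasting the token into a script makes it easy to leak by accident. One possible alternative, sketched below, is to read it from an environment variable. The variable name `HF_TOKEN` and the helpers `read_token` and `login_from_env` are assumptions made for this example, not something the article's setup requires.

```python
import os


def read_token(env=os.environ, var_name="HF_TOKEN"):
    """Return the access token stored in an environment variable.

    `read_token` and the default variable name are illustrative choices,
    not functions provided by huggingface_hub.
    """
    token = env.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"Set {var_name} to your user access token first.")
    return token


def login_from_env():
    # Imported here so read_token() works even without the library installed.
    from huggingface_hub import login

    login(token=read_token())
```

After exporting the variable in your shell, calling `login_from_env()` takes the place of the hard-coded `login(...)` call.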
After logging in, we can interact with the Model Hub repositories. For example, let’s download a single file from a model repository.
from huggingface_hub import hf_hub_download

hf_hub_download(repo_id="Groq/Llama-3-Groq-70B-Tool-Use", filename="config.json")
It’s also possible to download a file from a specific revision, such as a commit hash.
from huggingface_hub import hf_hub_download

# The keyword argument is `revision`; it accepts a branch name, tag, or commit hash.
hf_hub_download(
    repo_id="Groq/Llama-3-Groq-70B-Tool-Use",
    filename="config.json",
    revision="80ef0bf2502c651f45a93fceea7376899f284872",
)
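To fetch several files at once instead of downloading them one by one, `huggingface_hub` also offers `snapshot_download`, which the examples above do not use. The sketch below assumes its `revision` and `allow_patterns` arguments. The wrapper `download_pinned` and the `*.json` pattern are hypothetical choices that keep the example small.

```python
def download_pinned(download_fn, repo_id, revision, patterns=("*.json",)):
    """Download only files matching `patterns` from one fixed revision.

    `download_fn` is expected to behave like huggingface_hub.snapshot_download;
    this wrapper is illustrative and not part of the library.
    """
    return download_fn(
        repo_id=repo_id,
        revision=revision,
        allow_patterns=list(patterns),
    )


def main():
    # Needs `pip install huggingface_hub` and network access.
    from huggingface_hub import snapshot_download

    local_dir = download_pinned(
        snapshot_download,
        repo_id="Groq/Llama-3-Groq-70B-Tool-Use",
        revision="80ef0bf2502c651f45a93fceea7376899f284872",
    )
    print(local_dir)  # local folder that holds the downloaded files
```

Restricting the patterns skips the large weight files, which helps when you only want to inspect a model's configuration.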
You can also create your own model repository and upload files to it, such as config files or model weights.
from huggingface_hub import HfApi

api = HfApi()
api.create_repo(repo_id="my_model")
Then, we can upload files to the repository using the following code:
api.upload_file(
    path_or_fileobj="path/to/your/file/README.md",
    path_in_repo="README.md",
    repo_id="cornelliusyudhawijaya/my_model",
)
For further usage of the Model from the Model Hub, you can check out the Transformers documentation from Hugging Face.
That’s all the basics for the Model Hub. Let’s move on to the Hugging Face Community.
Hugging Face Community
I mentioned the Community section of the Model Repository earlier, where you can discuss the model and open pull requests. However, the Hugging Face community is more than that. The drop-down on the Hugging Face website links to many community pages, as in the image below.
Let’s try to break them down to understand better.
Hugging Face Community Blog Articles
The name is self-explanatory: the Community Blog Articles section contains blog posts and articles written by the community, for the community. You can publish your own article, but you need a Hugging Face Pro subscription to do so.
You can read many blogs and articles here and filter them by tag. Try to explore them, as you can learn a lot from community blogs and articles.
Hugging Face Community Learn
Speaking of learning, Hugging Face also provides the community with courses on various topics, such as NLP, Computer Vision, and many more.
Each course is self-paced with no time limit, and you can revisit it whenever you want. An example course is shown in the image below.
Try to use this platform as learning material, as it will be helpful for your future career.
Hugging Face Community Forum
You can join the Hugging Face Forum if you are into old-style discussions. In this forum, you can create topics and answer questions from the community. The forum looks like the image below.
You can also filter the categories to suit your needs. The forum is a great place to discuss with the community if you need more detailed answers.
Hugging Face Community Discord
If you prefer direct live chat, you can do that via Hugging Face Community Discord. After accepting the invite, you will be directed to the Community Discord.
You need to visit four different starter channels and verify your username using the Hugging Face User Access Token you previously created. Then, go to the LevelBot private message and chat with the bot using the following command.
Once verified, you will get a confirmation message and be free to explore the community.
Hugging Face Community GitHub
Hugging Face is often dubbed the GitHub of Machine Learning, but it still hosts many of its libraries, utilities, learning materials, and other resources in GitHub repositories.
At the time of writing, there are around 235 repositories you can explore and discuss with the community. You can also contribute to the open-source projects hosted on the Hugging Face GitHub.
Conclusion
Hugging Face is an important platform for machine learning development. It allows the community to easily access models, datasets, applications, and more in an open-source manner.
This article focused on exploring the Hugging Face Model Hub and Community. These two features are a big part of what makes Hugging Face so well known, so they are worth getting to know.