Apple researchers recently published a study on generative artificial intelligence. It focuses on a new approach to training methods for large language models (LLMs). As VentureBeat reports, it could be a significant advance in the field.
Careful combinations
Training methods developed by Apple researchers could pave the way for more efficient and versatile artificial intelligence systems. The idea is to train LLMs using combinations of several types of data. In the study report, an Apple researcher explains:
We found that to effectively train large-scale multimodal models, it is crucial to use a variety of data, including image captions, text associated with images, as well as text data alone.
Exploiting combinations of several data types reportedly made it possible to surpass the researchers' expectations on several points, in particular image captioning, answering questions based on an image, and natural-language understanding.
Adapting and scaling visual components
The Cupertino researchers also learned from their testing that choosing the right technologies to process visual elements was critical. In the study, it is mentioned:
We show that the image encoder as well as image resolution and number of image tokens have a substantial impact, while the design of the vision-language connector is of relatively negligible importance.
The encoder transforms images into data the computer can understand and is an important factor in the model's performance, as is the resolution of the image. As for image tokens, these are the units of data the model actually processes: the more tokens allocated to an image, the more accurately the model can analyze it.
Thus, the factors cited above are the most important to take into account, whereas the design of the vision-language connector matters comparatively little. The connector refers to the way a model combines visual information (what the image shows) with language (what the associated text says).
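To make these three pieces concrete, here is a minimal toy sketch, not Apple's actual architecture: an "encoder" that splits an image into patches and embeds each patch as a token, and a "connector" that projects those tokens into the LLM's embedding space. All sizes (224-pixel image, 16-pixel patches, embedding widths) are made-up illustrative values, and the random matrices stand in for learned weights.

```python
import numpy as np

rng = np.random.default_rng(0)

image = rng.random((224, 224, 3))      # dummy RGB image
patch = 16                             # 16x16-pixel patches
d_vision, d_llm = 64, 128              # toy embedding widths

# "Image encoder": cut the image into patches, flatten each one,
# and project it with a (here random, normally learned) matrix.
n = 224 // patch                       # 14 patches per side
patches = image.reshape(n, patch, n, patch, 3).transpose(0, 2, 1, 3, 4)
patches = patches.reshape(n * n, patch * patch * 3)   # 196 patches x 768 values
W_enc = rng.standard_normal((patch * patch * 3, d_vision))
image_tokens = patches @ W_enc                        # 196 image tokens

# "Vision-language connector": one linear map into the LLM's
# token-embedding space, so image tokens sit alongside text tokens.
W_conn = rng.standard_normal((d_vision, d_llm))
llm_ready = image_tokens @ W_conn                     # shape (196, 128)

print(llm_ready.shape)
```

The sketch also shows why resolution and token count are linked: doubling the image resolution at a fixed patch size quadruples the number of image tokens, giving the LLM more detail to work with at a higher compute cost.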
Apple is working hard on artificial intelligence, but conclusive results ready for the general public may be slow to arrive, as the latest Bloomberg report shows. We learn that Apple is reportedly in talks with Google to integrate Gemini into iOS, when many expected the company to offer its own tools.
Developing AI at the level of ChatGPT is a major challenge and takes time. To move faster, Apple has been buying many AI companies since last year. This may be a good sign, but given the sheer number of acquisitions, it could also indicate difficulty making progress in the field.
Looking at OpenAI's history, the company did not buy some thirty AI firms like Apple, but instead relied on its own researchers to get where it is. Of course, that took an enormous amount of time, with plenty of hiccups along the way: GPT-1 in 2018 had little in common with what GPT-4 offers today, and GPT-5 could arrive around 2025, according to OpenAI's latest official statements. Still, perhaps Apple will deliver something of the same level in less time; time will tell.
By: Keleops AG