Apple has shared technical details about the on-device and server foundation models that power its new Apple Intelligence system.
Apple Intelligence, announced at the 2024 Worldwide Developers Conference (WWDC), will be integrated into iOS 18, iPadOS 18, and macOS Sequoia. These AI capabilities are designed to assist users with daily tasks and learn from their activities. Importantly, Apple trains its models on licensed data, public data gathered by its AppleBot web crawler, and human-labeled and synthetic data, while explicitly excluding private user information.
Apple is also collaborating with OpenAI to integrate ChatGPT into iOS and Siri. This partnership aims to incorporate OpenAI’s advanced multimodal models, including GPT-4o, for more complex tasks, though specific details about this integration remain unclear.
Specialized AI Models and Cloud Infrastructure
Apple Intelligence includes several generative models aimed at enhancing a variety of user interactions. These models assist with text composition, notification management, image creation in conversations, and in-app functions that simplify user actions. The suite features a 3-billion-parameter on-device language model and a larger server-based model that operates through what Apple calls Private Cloud Compute on Apple silicon servers. The customized server hardware includes security features such as the Secure Enclave and Secure Boot, mirroring the protections found in iPhone hardware. This design aims to deliver both high performance and a fortified security layer.
The newly unveiled servers operate on an exclusive OS, which Apple describes as a fortified subset derived from iOS and macOS foundations. This operating system is optimized for Large Language Model (LLM) inference tasks and reduces traditional attack surfaces. Unlike conventional datacenter setups, it omits remote administration tools, focusing instead on limited operational metrics to ensure personal data remains protected.
Although specific CPU details were not disclosed, the inclusion of technologies such as Secure Enclave and Secure Boot suggests parallels with the A-series chips in iPhones. These chips are known for their 16-core neural engine capabilities, a feature also highlighted in the M4 processor recently used in the iPad Pro. The introduction of custom silicon for these servers demonstrates Apple’s capacity for advanced chip design aimed at high-security environments.
Apple Intelligence includes a coding model designed for Xcode and a diffusion model for visual enhancements in the Messages app. The on-device model reportedly surpasses prominent models like Phi-3-mini, Mistral-7B, and Gemma-7B, while the server-based model is competitive with DBRX-Instruct, Mixtral-8x22B, and GPT-3.5-Turbo, though it lags behind GPT-4.
Techniques for Model Optimization and Adaptation
Apple uses several techniques to boost the efficiency and performance of its models, including grouped-query attention and low-bit palletization for on-device inference. To minimize memory use and reduce costs, the models employ shared input/output vocabulary embedding tables, with a vocabulary size of 49K tokens for the on-device model and 100K for the server model.
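Grouped-query attention reduces memory and compute by letting several query heads share a smaller set of key/value heads. The sketch below is a generic, minimal NumPy illustration of the idea, not Apple's implementation; all dimensions and weight names are illustrative assumptions.

```python
import numpy as np

def grouped_query_attention(x, wq, wk, wv, n_q_heads, n_kv_heads):
    """Toy grouped-query attention: n_q_heads query heads share
    n_kv_heads key/value heads (n_q_heads must be a multiple of n_kv_heads)."""
    seq, d_model = x.shape
    head_dim = d_model // n_q_heads
    group = n_q_heads // n_kv_heads  # query heads per shared KV head

    q = (x @ wq).reshape(seq, n_q_heads, head_dim)
    k = (x @ wk).reshape(seq, n_kv_heads, head_dim)
    v = (x @ wv).reshape(seq, n_kv_heads, head_dim)

    out = np.empty_like(q)
    for h in range(n_q_heads):
        kv = h // group  # index of the KV head shared by this query head
        scores = q[:, h] @ k[:, kv].T / np.sqrt(head_dim)
        # Numerically stable softmax over the key dimension
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[:, h] = weights @ v[:, kv]
    return out.reshape(seq, d_model)

# Illustrative sizes: 4 query heads sharing 2 KV heads
rng = np.random.default_rng(0)
d_model, seq, n_q, n_kv = 8, 4, 4, 2
head_dim = d_model // n_q
x = rng.standard_normal((seq, d_model))
wq = rng.standard_normal((d_model, d_model))
wk = rng.standard_normal((d_model, n_kv * head_dim))
wv = rng.standard_normal((d_model, n_kv * head_dim))
y = grouped_query_attention(x, wq, wk, wv, n_q, n_kv)
print(y.shape)  # (4, 8)
```

Because the key/value projections are smaller than in standard multi-head attention, the KV cache kept during autoregressive decoding shrinks proportionally, which is why the technique suits memory-constrained on-device inference.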
Fine-tuning is achieved through adapters: small neural network modules that plug into the base model so it can specialize dynamically for a given task while retaining its general knowledge. These adapters power on-device capabilities such as email and notification summarization, which Apple says improved user satisfaction by around ten percent.
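A common way to realize such adapters is low-rank adaptation: the large base weight matrix stays frozen and only a small rank-r correction is trained per task. The sketch below assumes this low-rank scheme for illustration; Apple has not published the exact architecture of its adapters.

```python
import numpy as np

class LowRankAdapter:
    """Toy low-rank adapter: frozen base weight W plus a trainable
    rank-r update B @ A swapped in per task (hypothetical example)."""

    def __init__(self, w_frozen, rank=4, seed=0):
        rng = np.random.default_rng(seed)
        d_out, d_in = w_frozen.shape
        self.w = w_frozen                          # shared, frozen base weights
        self.a = rng.standard_normal((rank, d_in)) * 0.01
        self.b = np.zeros((d_out, rank))           # zero-init: adapter starts as a no-op

    def __call__(self, x):
        # Base path plus low-rank, task-specific correction
        return x @ self.w.T + x @ (self.b @ self.a).T

rng = np.random.default_rng(1)
w = rng.standard_normal((6, 8))   # frozen base layer
x = rng.standard_normal((3, 8))   # a small batch of inputs
adapter = LowRankAdapter(w, rank=2)
out = adapter(x)
# With B zero-initialized, the adapted layer matches the base model exactly
assert np.allclose(out, x @ w.T)
```

Because only the small `a` and `b` matrices differ per task, many task adapters can be stored and swapped cheaply on device while the multi-gigabyte base model is loaded once.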
Performance and Evaluation Metrics
Apple evaluates its AI models using a combination of benchmarks and human assessments, focusing on general capabilities and specific task performance. The models undergo comparisons with open-source and commercial counterparts of similar sizes, showing superior outcomes in instruction-following, summarization, and safety benchmarks. Apple says that human evaluators consistently rate its models higher for helpfulness and safety.
The company continues to perform manual and automated red-teaming to drive ongoing improvements in model safety. However, Apple's internal metrics indicate that, while efficient, its models do not quite match the performance of top-tier models such as GPT-4.
Principles of Responsible AI Development
Apple emphasizes its dedication to responsible AI practices, adhering to principles that focus on user empowerment, global representation, careful design to avoid misuse, and safeguarding user privacy. The company states that user data is excluded from its model training process and that it employs filters to remove personally identifiable information and low-quality content. The models are trained using Apple's AXLearn framework, employing techniques such as data parallelism to enhance efficiency and scalability.
Last Updated on November 7, 2024 7:44 pm CET