Meta has been using user data to train its large language model (LLM), Llama 2. This data-driven approach supports the development of Meta's forthcoming AI chatbot. The data is sourced from publicly available information as well as from Meta's products and services, including Facebook, Instagram, and the new Threads service. However, users have the option to limit the extent to which their data is used for this purpose.
Opting Out and Data Management
Users who wish to have more control over how their data is used by Meta have a couple of options. One of these is the Off-Facebook Activity (OFA) tools, which allow users to see a significant portion of the data that Facebook and its affiliates have collected about them. Through the OFA tools, users can manage their Off-Facebook Activity, turn off future activity, and clear their history.
Additionally, Meta has introduced the “Generative AI Data Subject Rights” form on Facebook. This form allows users to submit requests related to their third-party information being used for generative AI model training. Users can access, download, correct, or delete any personal information used for generative AI through this form.
Official Statements and Policies
According to the official page about generative AI on the Meta Privacy Center, generative AI enables the creation of content in innovative ways. These AI models, trained on vast amounts of data, can generate content like text and images. The training data comes from a mix of publicly available online information, licensed data, and information from Meta's products and services.
Meta emphasizes its commitment to privacy and has a robust internal Privacy Review process in place. This process ensures responsible data usage for all products, including generative AI. The company also provides a platform for users to raise concerns or objections related to their third-party information being used for generative AI model training.
Implications, Openness, and Future Directions
Meta's push into AI development has been evident with the release of models like LLaMA and SeamlessM4T, the latter of which offers speech and text translation in up to 100 languages. While the company sees immense potential in generative AI for creators and businesses globally, it also acknowledges the need for transparency and user control as the technology evolves.
One of the tenets of Meta's AI push is openness, both for users and in terms of access. Giving users the ability to opt out of sharing their information is an important step toward democratizing AI. However, some may worry that the move is a goodwill gesture that will not be upheld in practice.
Of course, Meta has a long history of mishandling user data and privacy from its days as Facebook. Rebranding is one thing, but users will hope the company is serious about protecting their rights.
Meta has also been promoting its AI as open, giving developers access to build their own solutions. However, “open” is a relative term, and Meta's definition appears incomplete. For example, the company readily admits that some training data is deliberately withheld. A recent study also found that Meta's models and OpenAI's ChatGPT are not as open as their makers claim.
The study, conducted by AI researchers at Radboud University in Nijmegen, Netherlands, shows that some of the most capable LLMs are effectively closed to the public because the code and data used to train them are not shared.
The study names OpenAI and Meta as the most closed LLM makers and argues that this secrecy hurts the AI community. It calls for more honesty and openness from companies so that others can learn from their work and build on it.