Meta pauses plan to train AI on user data after drawing regulatory scrutiny
云海听松
Posted on 2024-6-20 14:15:31
21st Century Business Herald reporter Xiao Xiao reports from Beijing
This week, Meta announced the suspension of using data from EU and UK users to train AI, and postponed the launch of its own large model in Europe.
Regulators in Ireland, the UK, Norway and elsewhere have stated that the company's move is a response to regulatory requirements. The Norwegian data protection authority said that Meta has promised to suspend the use of posts and images on Facebook and Instagram to train large models, and that it is currently unclear how long the suspension will last. Discussions are underway with regulatory agencies in other EU countries.
Meta's plan to collect user data began last month, when the platform notified European users that a new privacy policy would officially take effect by the end of June: the company would use public content on Facebook and Instagram to train its large model, including interactions, status updates, photos, and captions, but excluding private chat records and information from minors' accounts. The updated privacy policy sparked opposition, and the Austrian non-profit organization NOYB immediately filed complaints in 11 EU member states, requesting that urgency procedures be initiated.
The controversy is not an isolated case. How to obtain users' authorization to train AI on their data is a difficult problem for all Internet companies. Companies must not only understand the compliance standards but also take into account users' growing sensitivity about privacy. Experts interviewed by 21st Century Business Herald said that citing the EU's "legitimate interests" clause to obtain user data may become increasingly common. However, China's Personal Information Protection Law currently contains no directly comparable provision, and domestic enterprises need to pay special attention to obtaining users' explicit consent.
The "legitimate interests" clause may become a familiar face
In its complaint against Meta, NOYB identified two compliance failures.
The first is that Meta's description of artificial intelligence is too broad and does not specify the purpose for which user information is collected and processed. Meta's privacy policy uses only the term "artificial intelligence technology", which NOYB founder Max Schrems believes is equivalent to saying "we will use data in the database."
"Meta did not specify what it would use this data for, nor did it set any restrictions. Artificial intelligence technology may refer to a simple chatbot, highly aggressive personalized advertising, or even lethal drone weapons." Max Schrems explained.
The second is that users are opted in to data collection by default, and the opt-out process is complex. On Facebook, for example, users who want to refuse the platform's collection of their data must click through five levels of pages (Settings and Privacy - Privacy Center - Generative AI - More Information - "How Meta Trains Large Models with Data") to find an objection form at the end of the document. Only by actively filling out the form and having it approved by the company can users refuse data collection.
Meta argues that its large model needs to reflect the diversity of languages, geography, and cultural backgrounds of the European people, so its collection of user data should fall under the "legitimate interests" provided for in the General Data Protection Regulation (GDPR) and does not require specific user consent.
Generally speaking, the GDPR presumes that processing personal information is unlawful unless a legal basis applies, and the "legitimate interests" clause covers some situations where data collection is necessary and does not require user consent. Such lawful collection may serve individual, commercial, or public interests.
"The industry generally believes that the EU places strict restrictions on personal information processing, but in fact it leaves some room for interpretation through the legitimate interests clause," said Wang Xinrui, a partner at Shihui Law Firm who has worked in data compliance for many years. Wang Xinrui told 21st Century Business Herald that the legitimate interests clause is complex and flexible, and relying on it requires passing a series of tests. It can be described as a legal basis with considerable room for interpretation.
Meta has cited legitimate interests before, to defend collecting user data in order to serve personalized advertisements. However, the European Court of Justice ultimately rejected that claim, and Max Schrems therefore believes that legitimate interests are also unlikely to apply to scraping data and using it to train AI. Wang Xinrui said that for some emerging technology scenarios, other legal bases may be difficult to establish, while legitimate interests still leave some room for interpretation. That is why Meta is attempting to cite it, and he expects that "this clause will repeatedly appear in various AI-related cases in the future."
It should be noted that, unlike the European Union, China's Personal Information Protection Law does not directly include "legitimate interests" among its statutory exemptions. However, Wang Xinrui pointed out that some of the typical situations covered by the EU's GDPR are also covered by other provisions in China.
Lawyer Cheng Nian from Zhejiang Kenting (Beijing) Law Firm told 21st Century Business Herald that the comparable provisions in China apply only in limited situations: one is responding to public health emergencies or protecting natural persons in an emergency, and the other is actions that must by law be kept confidential. Examples include collecting data without user consent because of the epidemic, or for anti-terrorism investigations by public security agencies. Ordinary business operations usually struggle to fall within this scope.
User data becomes a sensitive point for the industry
"We are very disappointed. This is a setback for European innovation and for competition in artificial intelligence development, and it further delays the benefits that artificial intelligence brings to the European people." Meta complained in its blog that it is actually following the industry's approach - Google and OpenAI have already used European user data to train AI, and "compared to peers, our data collection methods are more transparent."
However, that does not appear to be the case: caution about user data has gradually become the consensus approach. For example, ChatGPT was the first to let users opt out of having their personal data used for training by turning off the chat history function, although this inevitably affects the quality of the large model's answers. On June 19th, Adobe specifically updated its terms of service, explicitly stating that Adobe's software will not use users' local or cloud content to train generative AI models.
Last year, the domestic office software WPS attempted to add a new clause to its privacy policy: "We will use the document materials you voluntarily upload as the basic materials for AI training after de-identification." After users discovered it, the clause triggered a collective boycott. WPS apologized to users and promised that user documents would not be used for AI training.
At present, the technology giants that explicitly collect user data to train AI include Google and X. To support Musk's AI company xAI, X updated its privacy policy in September last year, stating in Section 2.1: "We may use collected and publicly available information to help train our machine learning or artificial intelligence models." Last July, Google's privacy policy also added a new clause: "We may collect publicly available online information or information from other public sources to help train Google's artificial intelligence models."
However, Deng Zhisong, senior partner of Beijing Dacheng Law Firm, told 21st Century Business Herald at the time that Google had provided a detailed explanation of the scope and purpose of collecting and processing users' personal information. Even measured against the stricter "notice and consent" rules of the EU GDPR, Google's approach was at least formally compliant.
NOYB also pointed out that Meta hopes to collect all public and non-public personal information dating back to 2007, covering users' interaction traces on Facebook and Instagram, which differs from AI companies' usual practice of using information publicly available on the Internet.
How can companies meet compliance requirements and develop technology while respecting user rights? Wang Xinrui emphasized to 21st Century Business Herald that domestic companies that want to collect user data to train AI need to comply with the "Interim Measures for the Management of Generative Artificial Intelligence Services", which clearly stipulate that where personal information is involved, companies should obtain the individual's consent or otherwise have a lawful basis. That is to say, special attention must be paid to whether users have been clearly informed and have given their consent before their personal information is collected and used. If consent is not obtained in advance, another legal basis, such as a legal obligation or the public interest, is required; otherwise there are corresponding compliance risks.
Cheng Nian added that personal information collected through users' use of a product requires explicit consent, and sensitive information also requires separate consent. In addition, companies must ensure that users can easily access, correct, and delete their personal information and withdraw their consent, in particular by giving them the option to refuse the collection of their data for AI training, thereby protecting their right to know and right to choose.
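CandyLake.com is an information publishing platform and provides only information storage space services.
Disclaimer: The views in this article are solely those of the author. This article does not represent the position of CandyLake.com and does not constitute advice; please treat it with caution.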