Wednesday 6 March 2024

AI Risk Management: An Update

It's a while since I covered the legal aspects of AI here, but I've been posting on the topic fairly frequently on LinkedIn and more recently on Pragmatist. The widespread use of artificial intelligence (AI) - particularly generative AI - as well as the problems described below and the fact that you may not know you are relying on it, means you need to know how these technologies work (at least conceptually, if not in detail) and their impact. At scale, the harms from AI can arise before being detected, and a lot of AI has been launched as a ‘minimum viable product’ to suit the interests of developers over other stakeholders. But to avoid over-reacting, we need to be realistic about what AI can really achieve. To chart a safe route for the development and deployment of AI there’s a need to prioritise the public interest, and align technology with widely shared human values rather than the self-interest of a few tech enthusiasts, no matter how wealthy they are. That means uniting the AI industry, researchers and civil society around the public perspective. In this respect AI should be treated like aviation, health and safety, and medicines. It seems unwise for the next generation of AI to launch into unregulated territory. If you would like advice on any aspects of this post, please let me know.

 
What is AI?

The term "AI" embraces a collection of technologies that involve ‘machine learning’ at some point:

  • artificial neural networks (ANN) – one ‘hidden’ layer of processing
  • deep learning networks (DNN) – multiple ‘hidden’ layers of processing
  • machine perception - the ability of processors to analyse data (whether as images, sound, text, unstructured data or any combination) to recognise/describe people, objects and actions.
  • automation
  • machine control – robotics, autonomous vehicles, aircraft and vessels
  • computer vision – image, object, activity and facial recognition
  • natural language processing - speech and acoustic recognition/response
  • personalisation
  • Big Data analytics
  • Internet of things (IoT)

While AI technologies themselves may be complex, the concepts are simple. Traditionally, we load a software application and data into a computer, and run the data through the application to produce a result/output. But machine learning involves feeding the data and desired outputs into one or more computers or computing networks that are designed to write the programme (e.g. you feed in data on crimes/criminals and the output of whether those people re-offended, with the object of producing a programme that will predict whether a given person will re-offend). In this sense, data is used to ‘train’ the computer to write and adapt the programme, which constitutes the "artificial intelligence".
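For readers who find code easier than prose, the contrast between the two approaches can be sketched in a few lines of Python. This is a toy illustration only: the figures, the function names and the very crude 'learning' rule (try each threshold and keep the one that best matches past outcomes) are all assumptions made up for this sketch, not a description of any real system.

```python
# Toy illustration only: the data and the "learning" rule are invented for this sketch.

def handwritten_rule(prior_offences: int) -> bool:
    """Traditional approach: a person writes the rule (the programme) by hand."""
    return prior_offences >= 3


def learn_threshold(examples: list[tuple[int, bool]]) -> int:
    """Machine-learning approach: choose the threshold that best fits past outcomes."""
    candidates = sorted({x for x, _ in examples})
    best_threshold, best_correct = candidates[0], -1
    for t in candidates:
        # Count how many past cases the rule "x >= t" would have got right.
        correct = sum((x >= t) == y for x, y in examples)
        if correct > best_correct:
            best_threshold, best_correct = t, correct
    return best_threshold


# Hypothetical training data: (number of prior offences, did the person re-offend?)
training_data = [(0, False), (1, False), (2, True), (4, True), (5, True), (1, False)]

threshold = learn_threshold(training_data)


def learned_rule(prior_offences: int) -> bool:
    """The 'programme' produced from the data rather than written by a person."""
    return prior_offences >= threshold
```

Note that the learned rule is only as good as the examples it was given: change the training data and the threshold, and therefore the predictions, change with it.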

So, in a traditional computing scenario you can more readily discover that the wrong result was caused by bad data but this may be impracticable with a single hidden layer of computing in an ANN, let alone in a DNN with its multiple hidden layers.

Generative AI tools are built using foundation models that are either single modal (receiving input and generating content using only text, for example) or multi-modal (able to deal with text, audio, images and so on). A large language model (LLM) is a type of foundation model. As explained to the House of Lords' communications and digital select committee, LLMs are designed around probability and have nothing to do with ‘truth’. They learn patterns of language and generate from those learned patterns. So, a valid output for the AI may be obviously wrong to a human with more facts available.
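The point about probability rather than ‘truth’ can be illustrated with a drastically simplified sketch. Real LLMs are neural networks trained on vast amounts of text; the word-pair model below is an assumption chosen purely because it fits in a few lines, and the three-sentence 'corpus' is invented.

```python
import random
from collections import defaultdict

# Invented three-sentence "corpus"; real models train on vastly more text.
corpus = (
    "the court allowed the appeal . "
    "the court dismissed the claim . "
    "the judge allowed the claim ."
)


def train_bigrams(text):
    """Record which words follow each word, keeping repeats so frequency is preserved."""
    followers = defaultdict(list)
    words = text.split()
    for current, nxt in zip(words, words[1:]):
        followers[current].append(nxt)
    return followers


def generate(followers, start, length, seed=0):
    """Build a sentence by repeatedly picking a word that has followed the last one."""
    rng = random.Random(seed)
    output = [start]
    for _ in range(length):
        options = followers.get(output[-1])
        if not options:
            break
        output.append(rng.choice(options))
    return " ".join(output)


model = train_bigrams(corpus)
sentence = generate(model, "the", 6)
```

Every step in the generated sentence is a word pair the model has seen, yet nothing stops it assembling a statement such as "the judge dismissed the appeal", which appears nowhere in the training text. The output is valid as language but has no connection to what actually happened.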

Various AI technologies are often used in conjunction (e.g. scanning documents for hints of fraud, robotic process automation ("RPA") and personalising services for individuals or groups of customers); and may be combined with devices or other machines in the course of biometrics, robotics, the operation of autonomous vehicles, aircraft, vessels and the 'Internet of things'.

AI is better than humans at some tasks (“narrow AI”) but “general AI” (same intelligence as humans) and “superintelligence” (better than humans at everything) are the stuff of science fiction.

What is AI used for?

AI is used for:

  1. Clustering: putting items of data into new groups (discovering patterns);
  2. Classifying: putting a new observation into pre-defined categories based on a set of 'training data'
  3. Predicting: assessing relationships among many factors to assess risk or potential relating to particular conditions (e.g. creditworthiness);
  4. Generating new content.
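The first three uses above can each be sketched in a few lines. The methods chosen here (splitting at the widest gap, copying the label of the nearest example, and fitting a straight line) are deliberately simple stand-ins picked for this illustration, and all the figures and labels are invented.

```python
# All figures and labels are invented for illustration.

def cluster_by_largest_gap(values):
    """Clustering: split unlabelled values into two groups at the widest gap."""
    ordered = sorted(values)
    gaps = [b - a for a, b in zip(ordered, ordered[1:])]
    split = gaps.index(max(gaps)) + 1
    return ordered[:split], ordered[split:]


def classify_nearest(value, labelled):
    """Classifying: give a new observation the label of its closest training example."""
    _, label = min(labelled, key=lambda pair: abs(pair[0] - value))
    return label


def predict_linear(x, points):
    """Predicting: fit a least-squares line to past (x, y) pairs and read off y for a new x."""
    n = len(points)
    mean_x = sum(px for px, _ in points) / n
    mean_y = sum(py for _, py in points) / n
    slope = (
        sum((px - mean_x) * (py - mean_y) for px, py in points)
        / sum((px - mean_x) ** 2 for px, _ in points)
    )
    return mean_y + slope * (x - mean_x)


low, high = cluster_by_largest_gap([12, 15, 14, 90, 95, 13, 88])
label = classify_nearest(85, [(10, "small claim"), (100, "large claim")])
estimate = predict_linear(4, [(1, 10), (2, 20), (3, 30)])
```

The difference worth noticing is that clustering is given no categories in advance, whereas classifying and predicting both depend entirely on the labelled examples supplied by a human.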

The Challenges with AI

There is a long list of concerns about AI, including:

  1. cost/benefit – it cost $50m in electricity to teach an AI to beat a human being at Go, and took hundreds of attempts to get a robot to do a backflip; and the power to generate a single AI image from text could charge an iPhone;
  2. dependence on training data licences, quantity, quality, timeliness and availability;
  3. lack of understanding - an AI that might predict 79% of European Court judgments doesn't know any law; it just counts how often words appear alone, in pairs or fours;
  4. inaccuracy - no AI is 100% accurate;
  5. Infringement of copyright, privacy, confidentiality, trade secrets etc. in the training data;
  6. Whether using AI can meet the test of “author’s own intellectual creation” to attract copyright protection;
  7. ‘hallucination’ by generative AIs (producing spontaneous errors or inaccurate responses, e.g. fictitious court citations or literary ‘quotes’ from bogus works);
  8. Deepfakes (deliberately created fake still and moving images and/or recordings);
  9. Making existing types of malicious activity easier;
  10. lack of explainability - machine learning involves the computer adapting the programme in response to data, and it might react differently to the same data added later, based on what it has 'learned' in the meantime; 
  11. Specific legal/ethical issues associated with specific AI technologies, such as the use of automated facial recognition by the police; and where liability falls given that the AI itself has no legal personality or status.
  12. Bias - the inability to remove both selection bias and prediction bias; 
  13. the challenges associated with the reliability of evidence and how to resolve disputes arising from its use - lawyers have not typically been engaged in AI development and deployment;
  14. There are concerns around the secondary impact of AI on employment and on other services that it might draw upon without refreshing or maintaining.
  15. AI systems may reveal training data and actual copyright material and privacy information under a ‘divergence attack’ or merely unusual requests that cause the AI to break its ‘alignment’ (e.g. asking ChatGPT 3.5 to repeat the word ‘poem’).
  16. Some users complain that chatbots can be lazy, or fail to perform requested tasks without prompts (or maybe even at all). 
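To make the 'lack of understanding' point (item 3 above) concrete, here is a minimal sketch of what counting words alone, in pairs or in fours looks like. The sample sentence is invented, and this shows only the counting step; how a real prediction system turns such counts into a forecast is not covered here.

```python
from collections import Counter


def ngram_counts(text, n):
    """Count every run of n consecutive words in the text."""
    words = text.lower().split()
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))


# Invented sample text standing in for a judgment.
judgment = "the applicant was detained the applicant was not informed of the reasons"

singles = ngram_counts(judgment, 1)  # words alone
pairs = ngram_counts(judgment, 2)    # words in pairs
fours = ngram_counts(judgment, 4)    # words in fours
```

Nothing in these counts represents a legal rule, a fact or an argument. A system built on them can spot that certain word patterns tend to accompany certain outcomes, but it has no idea why.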

The House of Lords committee (like the FTC in the US) found that AI poses credible threats to public safety, societal values, copyright, privacy, open market competition and UK economic competitiveness.

LLMs may amplify any number of existing societal problems, including inequality, environmental harm, declining human agency and routes for redress, digital divides, loss of privacy, economic displacement, and growing concentrations of power.

LLMs might entrench discrimination (for example in recruitment practices, credit scoring or predictive policing); sway political opinion (if using a system to identify and rank news stories); or lead to casualties (if AI systematically misdiagnoses healthcare patients from minority groups).

Unacceptable Uses for AI

From all these challenges one can deduce and infer acceptable and unacceptable use-cases. For instance, it now seems obvious to use an AI system to trawl through a closed set of discovered documents and other data, seeking evidence on a certain issue.

An AI might be allowed to run in a fully automated way where commercial parties are able to knowingly accept a certain level of inaccuracy and bias and losses of a quantifiable scale (though we’ve seen disasters arise through algorithmic trading and where markets for some instruments suddenly grind to a halt through human distrust of the outputs).

But an AI should not be used to fully automate decisions that affect an individual’s fundamental rights and freedoms, or to grant benefits claims, approve loan applications, invest a person’s pension pot, set individual pricing or predict, say, criminal conduct. It is also probably unacceptable to simply overlay a right to human intervention in such cases – or rely on human intervention by staff – since the Post Office/Horizon scandal has demonstrated that human intervention is no panacea! AI might be used to some degree in steps along the way to a decision, but the decision itself should be consciously human. In other words, a human should be able to explain why and how the decision was reached, the parameters and so on, to be able to re-take the decision if necessary.

The default position among many AI technologists is that AI development should free-ride on human creativity and personal data. This has implications for copyright, trade marks and privacy.

Copyright

OpenAI has admitted that their platforms would not exist without access to copyright materials:

“Because copyright today covers virtually every sort of human expression – including blogposts, photographs, forum posts, scraps of software code, and government documents – it would be impossible to train today’s leading AI models without using copyrighted materials,” said OpenAI in its submission to the House of Lords communications and digital select committee (as also covered in The Guardian).

Meta’s new AI image generator was trained on 1.1 billion Instagram and Facebook photos.

Midjourney founder David Holz has admitted that his company did not receive consent for the hundreds of millions of images used to train its AI image generator, outraging photographers and artists. And a spreadsheet submitted as evidence in a copyright lawsuit against Midjourney allegedly lists thousands of artists whose images the startup's AI picture generator "can successfully mimic or imitate."

Illustrators Sarah Andersen, Kelly McKernan, and Karla Ortiz filed suit in the Northern District of California against Midjourney Inc, DeviantArt Inc (DreamUp), and Stability A.I. Ltd (Stable Diffusion). They term these text-to-image platforms “21st-century collage tools that violate the rights of millions of artists.” 

The New York Times has sued OpenAI and Microsoft for allegedly building LLMs by copying and using millions of The Times’s copyright works through Microsoft’s “Copilot” and OpenAI’s ChatGPT, seeking to free-ride on The Times’s investment in journalism by using it to build substitutive products without permission or payment. 

Getty Images claims Stability AI ‘unlawfully’ scraped millions of images from its site. Getty Images argued before the UK’s House of Lords committee that “ask for forgiveness later” opt‑out mechanisms were “contrary to fundamental principles of copyright law, which requires permission to be secured in advance”.

Trade marks

AI has revolutionised advertising and marketing in terms of how products are searched for and/or ‘found’. This depends on:

·       which search methods customers use to find your products and services and how those engines select their results;

·       how voice-controlled personal assistants select products if the user asks it to buy items from a shopping list but without specifying brands (they may use buying history or prioritise products under paid promotional schemes); and

·       your brand's presence in search engine results (keywords) or other AI-controlled marketing programmes.

AI and data protection

The Information Commissioner’s Office has identified AI as a priority area and is focusing in particular on the following aspects: (i) fairness in AI; (ii) dark patterns; (iii) AI as a Service (AIaaS); (iv) AI and recommender systems; (v) biometric data and biometric technologies; and (vi) privacy and confidentiality in explainable AI.

In addition to the basic principles of UK GDPR and EU GDPR compliance at Articles 5 and 6 (lawfulness through consent, contract performance, legitimate interests; fairness and transparency; purpose limitation; data minimisation, accuracy; storage limitation; and integrity and confidentiality), AI raises a number of further issues. These include:

·       The AI provider’s role as data processor or data controller.

·       Anonymisation, pseudonymisation and other AI compliance tools:

                Taking a risk-based approach when developing and deploying AI.

                Explaining decisions made by AI systems to affected individuals.

                Only collecting the data needed to develop the AI system and no more.

                Addressing the risk of bias and discrimination at an early stage.

                Investing time and resource to prepare data appropriately.

                Ensuring AI systems are secure.

                Ensuring any human review of AI decisions is meaningful.

                Working with external suppliers to ensure AI use will be appropriate.

·       Profiling and automated decision-making – important to consider that human physiology is ‘normally’ distributed but human behaviour is not

                Right to object to solely automated decisions, except in certain situations where you must at least have the right to human intervention anyway, with further restrictions on special categories of personal data.

·       The lawful basis for web-scraping (also being considered by the IPO in terms of copyright protection).

How to govern the use of AI?

Given the scale of the players involved in creating AI systems, and the challenges around competition and lack of explainability, there’s a very real risk of regulatory capture by Big Tech.

For evidence of Big Tech involvement in governance issues, witness the boardroom psychodrama over the governance of OpenAI and who should be its CEO, a battle won by Microsoft as a shareholder over the concerns of OpenAI’s board of directors.

To date, the incentives to achieve scale over rivals or for start-ups to get rich quick have obviously favoured early release of AI systems over concerns about the other challenges, though that may have changed with the recent decision by Google to pull the Gemini text to image system.

There’s also a cult among certain high profile venture capitalists and others in Silicon Valley, self-styled as ‘techno-optimism’. They’ve published a 'manifesto' asserting the dominance of their own self-interest, backed by a well-funded 'political action committee' making targeted political donations, supporting candidates who back their tech agenda and blocking those who don’t.

To chart a safe route for the development and deployment of AI there’s a need to prioritise the public interest, and align technology with widely shared human values rather than the self-interest of a few tech enthusiasts, no matter how wealthy they are. That means uniting the AI industry, researchers and civil society around the public perspective, as advocated by The Finance Innovation Lab (of which I’m a Fellow).

In this respect AI should be treated like aviation, health and safety, and medicines and it seems unwise for the next generation of AI to launch into unregulated territory.

There are key liability issues to be solved, and mechanisms are needed for attributing and apportioning causation and liability upstream and downstream among developers, deployers and end-users.

To address concentration risk and barriers to entry there needs to be easier portability and the ability to switch among cloud providers.

In the absence of regulation, participants (and victims) will look to contract and tort law (negligence, nuisance and actions for breaches of any existing statutory duties).

Regulatory Measures

Outside the EU, the UK is a rule taker when it comes to regulating issues that have any global scale. China, the EU and the US will all drive regulation, but geography and trade links mean the trade bloc on the UK’s doorstep is the most important.

Examples of regulatory measures from the EU, US and China (summarised at the end of this note)  seek to draw some red lines in areas impacted by AI to at least force the industry to engage with legislators and regulators if the law is not to overly restrict development and deployment of AI. You might question the flexibility of this approach but given the risks it does seem reasonable. After all, it’s a very common tension within organisations as to whether the business units, tech developers or support teams can move more quickly on a given change project, depending on the challenges involved. So, why should the world outside AI development businesses move at the speed of the tech developers as opposed to other stakeholders (without holding AI businesses to account)? As pointed out to the House of Lords committee, developers have greatest insight into, and control over, an AI’s base model, yet downstream deployers and users may have no idea what data an AI was trained on, the nature of any testing and potential limitations on its use.

Meanwhile, the UK government’s do-nothing position is dressed up as being ‘pro-innovation’ but is at the very least a fig leaf for us being a rule-taker, and at worst demonstrates a dereliction of duty and/or regulatory capture.  Some of the UK’s 90 regulatory bodies are using their current powers to address the risks of AI (such as the ICO’s focus on the implications for privacy, as mentioned above). But the UK’s Intellectual Property Office has shelved a long-awaited code setting out rules on the training of artificial intelligence models using copyrighted material, dealing a blow to the creative industry.

How to Approach AI risk management

The following steps are involved in the process of understanding and managing the risks relating to AI:

      Perspective: developer, deployer or end-user?

      Context and end-to-end activity/processes affected

      Nature of AI system(s) involved

      Use/purpose of AI

      Sources, rights, integrity of training data

      Tolerances for inaccuracy/bias

      Sense-check for proposed human oversight/intervention

      Governance/oversight function (steering committee?)

      Testing, testing, testing

      Data licensing

      GDPR impact assessment, record of processing, privacy policy (data collected, purpose, lawful basis) and any consents

      Commercial contracts, addressing upstream and downstream rights, obligations, liability

      Controls (defect/error detection), fault analysis, complaints handling, dispute resolution

      Feedback loop for improvements
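As one illustration of the 'tolerances for inaccuracy/bias' and 'testing' steps, the sketch below compares error rates across groups against an agreed tolerance. The groups, the test results, the 30% tolerance and the function names are all hypothetical, chosen only to show the shape of such a check; a real tolerance is a business and legal judgment, not something the code can supply.

```python
from collections import defaultdict


def error_rates_by_group(records):
    """records: iterable of (group, predicted, actual). Returns {group: error rate}."""
    totals, errors = defaultdict(int), defaultdict(int)
    for group, predicted, actual in records:
        totals[group] += 1
        if predicted != actual:
            errors[group] += 1
    return {g: errors[g] / totals[g] for g in totals}


def breaches(rates, tolerance):
    """List the groups whose error rate exceeds the agreed tolerance."""
    return sorted(g for g, r in rates.items() if r > tolerance)


# Hypothetical test results: (group, what the AI predicted, what actually happened)
test_results = [
    ("A", True, True), ("A", False, False), ("A", True, True), ("A", False, True),
    ("B", True, False), ("B", False, True), ("B", True, True), ("B", False, False),
]

rates = error_rates_by_group(test_results)
flagged = breaches(rates, tolerance=0.30)
```

In this invented example the system is wrong a quarter of the time for group A but half the time for group B, so an overall accuracy figure would hide exactly the disparity that the governance function needs to see.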

If you would like advice on any aspects of this post, please let me know.


Examples of regulatory measures from the EU, US and China

EU

EU Artificial Intelligence Act is expected to enter into force early in 2024 with a 2 year transition period. It proposes a risk-based framework for AI systems, with AI systems presenting unacceptable levels of risk being prohibited. The AI Act identifies, defines and creates detailed obligations and responsibilities for several new actors involved in the placing on the market, putting into service and use of AI systems. Perhaps the most significant of these are the definitions of “providers” and “deployers” of AI systems. The Act covers any AI output which is available within the EU and so would cover UK companies providing AI services in the EU. There is expected to be a transition period of two years before the Act is fully in force, but some provisions may come into effect earlier: six months for prohibited AI practices and 12 months for general purpose AI.

The AI Act defines an AI system as:


“...a machine-based system designed to operate with varying levels of autonomy and that may exhibit adaptiveness after deployment and that, for explicit or implicit objectives, infers, from the input it receives, how to generate outputs such as predictions, content, recommendations, or decisions that can influence physical or virtual environments.”

The AI Act prohibits ‘placing on the market’ AI systems that: use subliminal techniques, exploit vulnerabilities of specific groups of people, create a social score for a person that leads to certain types of detrimental or unfavourable treatment, or which categorise a person based on classification of their biometric data; assess persons for their likelihood to commit a criminal offence based on an assessment of their personality traits; as well as the use of real-time, remote biometric identification systems in publicly accessible spaces by or on behalf of law enforcement authorities (except to preserve life). There are also compliance requirements for high risk AI systems.

The draft AI Liability Directive and revised Product Liability Directive will clarify the rules on making claims for damage caused by an AI system and impose a rebuttable presumption of causality on an AI system, subject to certain conditions. The two directives are intended to operate together in a complementary manner. The Directive is likely to be formally approved in early 2024 and will apply to products placed on the market 24 months after it enters into force.

EU Digital Services Act entered into force on 16 November 2022 and imposes obligations on providers of various online intermediary services, such as social media and online marketplaces. It is aimed at ensuring a safer and more open digital space for users and a level playing field for companies, including provisions banning dark patterns.

EU Digital Markets Act became fully applicable on 2 May 2023 and the European Commission has received notifications from seven companies who consider that they meet the gatekeeper thresholds.

EU Machinery Products Regulation covers emerging technologies (for example, internet of things (IoT)). Although AI system risks will be regulated by the proposed AI Act (see EU Artificial Intelligence Act), the Machinery Regulation will look at whether the machinery as a whole is safe, taking into account the interactions between machinery components including AI systems. In-scope machinery and products imported into the EU from third countries (such as the UK) will need to adhere to the Machinery Regulation.

EU General Product Safety Regulation will apply from 13 December 2024.

EU Data Governance Act, with effect from 23 September 2023, establishes mechanisms to enable the reuse of some public sector data. The availability of data within a controlled mechanism will be of benefit to the development of AI solutions.

The EU Data Act requires providers of products and related services to make the data generated by their products (for example, IoT devices) or services easily accessible to the user, regardless of whether the user is a business or a consumer. The user will then be able to provide the data to third parties or use it for their own purposes, including for AI purposes. The EU Data Act was published in the Official Journal on 22 December 2023 and applies from 12 September 2025.

US

In October 2023 the White House published mandatory requirements for sharing safety testing information before “the most powerful AI systems” are made public; and some very interesting remedies are coming out of the Federal Trade Commission, such as:

·       inquiries into Big AI activity;

·       aligning liability with ability and control (upstream liability);

·       Remedies to address incentives, ‘bright line’ rules on data/purposes:

·       AI trained on illegal data to be deleted;

·       action on voice impersonation fraud and models that harm consumers; and

·       cannot retain children’s data indefinitely, especially to train models.

China

China has addressed generative AI by requiring:

·       license to provide gen AI to the public

·       security assessment if public opinion attributes or social mobilization capabilities in the model

·       uphold integrity of state power, not incite secession, safeguard national unity, preserve economic/social order, align with socialist values

·       Additional interim measures that also focus on other countries’ concerns around AI impact:

o   IP protection

o   Transparency, and

o   Non-discrimination

While we might not agree with the sort of cultural control being imposed by Chinese legislators in the context of generative AI, they perhaps point to a model for how to introduce western civil society concepts into our legislation.


Tuesday 5 March 2024

Pay-or-Consent Ignores the Elephant-in-the-Room

European consumer bodies have united to file 8 local data protection complaints against Meta, claiming that "to ask consumers using Facebook and Instagram to give their consent to the processing of their personal data for advertising purposes or alternatively to pay a fee of up to €311 per year" does not cure various problems under the General Data Protection Regulation in the way Meta processes its customers' personal data. This also likely affects the status of training data that Meta has drawn from Facebook and Instagram to power its artificial intelligence systems. Previous complaints have resulted in changes to Meta privacy policies, but no real change in the underlying data collection and processing. Customers' investment of time and effort in their accounts and Meta's market dominance make switching unrealistic. If the complaints are successful, it would suggest both free and paid-for functionality will be much more limited in future, but perhaps subscription revenue might make up for any lost ad revenue. Meta obviously disputes the claims.

The consumer bodies say that Meta collects way more personal data about its users than is necessary for the purposes claimed, such as performing its contracts with users, and this also fails to meet the GDPR requirement to minimise the personal data collected. 

In addition, there is too little transparency and explanation of the use or purpose for collecting each type of personal data, and the legal basis relied upon. That would mean Meta isn't clear what types of data must be processed for contractual purposes and which types are covered by user consent, for example. It would also mean that any consent relied upon was not fully informed and therefore was not validly given.

While this calls into question the ability of Facebook and Instagram to use their customers' personal data to power behavioural advertising and the related revenues, it would also taint the use of such personal data as training data for Meta's AI tools and systems.

The claims in more detail (which Meta obviously would deny strenuously) are:

  • Meta’s personal data processing for advertising purposes lacks a valid legal basis because it relies on consent which has not been validly collected for the purposes of the GDPR; 
  • Some of Meta’s processing for advertising purposes appears to rely invalidly on contract; 
  • Meta cannot account for the lawfulness of its processing for content personalisation since it is not clear – and there is no way to verify – that all of Meta’s profiling for that purpose is (a) necessary for the relevant contract and (b) consistent with the principle of data minimisation; 
  • It is not clear – and there is no way to verify – that all of Meta’s profiling for advertising purposes is necessary for that purpose and therefore consistent with the principle of data minimisation; 
  • Meta’s processing in general is not consistent with the principles of transparency and purpose limitation; and 
  • Meta’s lack of transparency, unexpected processing, use of its dominant position to force consent, and switching of legal bases in ways which frustrate the exercise of data subject rights, are not consistent with the principle of fairness.

Previous complaints have resulted in changes to privacy policies, to try to clarify the purpose and legal basis of processing, but the consumer bodies say this has not interrupted the underlying processing that they say is illegal. Meta would obviously dispute this. 

While it's tempting to think users can simply vote with their feet, the amount of time consumers have invested in their accounts - and Meta's market dominance - means that is not a realistic option.

If the complaints are successful, it would suggest both free and paid-for functionality will be much more limited in future, but perhaps subscription revenue might make up for any lost ad revenue...

Watch this space.