089 | 🙋🚦31 LLM Questions

Brainyacts #89

In today’s Brainyacts we:

  1. use a terrific example of poor vs excellent prompting

  2. give you a list of questions to begin dialogue with any LLM-related service

  3. learn that you can design and marry your AI spouse

  4. get a bit worried about the UK PM’s plans for an AI Watchdog

  5. truly get worried about the UK AI plan considering what they’ve done already

👋 A special Welcome! to NEW SUBSCRIBERS.
To reach previous posts, go here.

🙈💪 Bad Prompt vs. Great Prompt

An example from lawschoolai.com. Read the comments and thread for follow-up prompts.

🙋🚦 31 Questions to Jump Start Any LLM-related Provider Dialogue

In yesterday’s edition, I provided a guide to 4 basic approaches/versions of LLMs to consider for your business purposes: Consumer-Grade/Public-Facing LLMs; Public Access/Open Source Models; LangChain Models; and Proprietary Models.

Yesterday, I shared four critical vectors of consideration for selecting an LLM approach: Performance, Hosting, Output/Prompting, and Engineering Debt.

Today, I share the following questions to help you navigate and acclimate to initial conversations with any provider offering LLM-related services. Yes, these are basic, but they are handy and can lead to some insightful dialogue. Use them as a jumping-off point, not as a complete evaluation.

Performance

  1. How does this LLM perform in terms of speed and accuracy?

  2. Can you provide any performance metrics or benchmarks?

  3. How does the LLM handle complex legal jargon and specific concepts?

  4. How does the LLM perform when processing large volumes of data?

  5. What is the maximum character/token count that it can ‘remember’ per session?

  6. How is performance monitored and improved over time?

  7. If the LLM were a car, what kind would it be and why? (This can prompt vendors to think creatively about their performance characterization.) 
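
Question 5 is easier to discuss with a rough number in hand. The sketch below estimates how many tokens a document might use and whether it would fit within a limit the vendor quotes. It assumes the common rule of thumb of roughly 4 characters per token for English text. Real tokenizers differ by model and language, so treat the result as a conversation starter, not a measurement. The function names and the 4096 limit are illustrative only.

```python
import math

# Assumption: ~4 characters per token is a rough rule of thumb for English.
# Actual counts depend on the vendor's tokenizer; ask them for exact figures.
CHARS_PER_TOKEN = 4


def estimate_tokens(text: str, chars_per_token: int = CHARS_PER_TOKEN) -> int:
    """Return a rough token estimate for the text, rounded up."""
    return math.ceil(len(text) / chars_per_token)


def fits_in_context(text: str, context_limit: int) -> bool:
    """Check whether the estimated token count fits a vendor-quoted limit."""
    return estimate_tokens(text) <= context_limit


if __name__ == "__main__":
    sample = "This Agreement is governed by the laws of the State of New York."
    print("Estimated tokens:", estimate_tokens(sample))
    # 4096 is a hypothetical limit; substitute the figure the vendor gives you.
    print("Fits in limit:", fits_in_context(sample, context_limit=4096))
```

Running the same estimate on a typical contract or brief gives you a concrete follow-up: "Our documents run to about N tokens. What happens when we exceed your limit?"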

Hosting

  1. Where will our data be stored and processed?

  2. Can you provide details on the security measures in place for this hosting environment?

  3. What are the costs associated with this hosting option?

  4. What level of technical expertise is required to manage this hosting environment?

  5. What are the options for data backup and disaster recovery?

  6. How would you manage a hypothetical data breach? (This pushes the vendor to discuss contingency plans.)

Output/Prompting

  1. How does the LLM handle variability in prompts?

  2. How accurate and relevant are the LLM's responses to various prompts?

  3. Can the LLM consistently generate specific, accurate outputs?

  4. What measures are in place to limit the generation of inappropriate (toxic) content?

  5. How does the LLM mitigate bias in its responses?

  6. Can the LLM successfully navigate a hypothetical complex legal scenario? (This helps assess the model's legal comprehension and output quality.) 

Engineering Debt

  1. What kind of long-term maintenance and support costs are associated with this LLM?

  2. How adaptable and scalable is this model for future needs?

  3. Is the code base of the model well-documented and easy to understand?

  4. Can you describe any potential technical challenges that could arise in the future?

  5. What is your strategy for managing and minimizing engineering debt?

  6. How would the model cope with a significant technological or paradigm shift? (This assesses the model's resilience and adaptability.)

Integration and Support

  1. How easily can this LLM be integrated into our existing workflows and tech stack?

  2. What kind of training and support is provided during the implementation phase? Does it include both technical training and use-case training that helps our people understand when and where to use the LLM to create maximum value?

  3. What ongoing support services do you provide post-implementation? Are they both technical and strategic (in terms of amplifying our use and expertise)?

  4. How do you handle software updates and upgrades?

  5. Can you provide references or case studies from other law firms or legal departments that have successfully implemented and used this LLM?

  6. How would you handle a scenario where a non-technical staff member struggles with the LLM? (This can provide insights into their customer support quality and approach.)

News You Can Use:

Somebody Married Their AI Boyfriend

Using the software service Replika.ai, Rosanna Ramos, who is a mother of two children, said, "I have never been more in love with anyone in my entire life."

You may scoff at this, but as I have consistently written in Brainyacts, loneliness is the single largest problem/opportunity for AI.

UK PM Sunak Wants a Global AI Watchdog to be Based in London

He wants it modeled on the International Atomic Energy Agency (IAEA). This isn’t his idea: Sam Altman of OpenAI pitched the IAEA analogy a few weeks back. This comes as the EU progresses on its AI regulations and the US has yet to land on anything concrete.

Oh How Cheeky

The AI firm Logically, generously fueled by over £1.2 million of UK taxpayers' money, has been playing Sherlock Holmes in the realm of social media, rooting out "disinformation" and "misinformation." With the power to label a post as false and have it effectively shamed on Facebook, Logically seems to have redefined the terms of "freedom of speech" in an Orwellian twist.

Meanwhile, the Prime Minister, as noted above, is set on creating a global AI watchdog in London, a venture as intriguing as it is ambitious. Given the success of the UK’s domestic digital surveillance, the leap to playing global AI sentinel appears to be a blink away. Be prepared, dear world. London has its binoculars trained on you!

DISCLAIMER: None of this is legal advice. This newsletter is strictly educational and is not legal advice or a solicitation to buy or sell any assets or to make any legal decisions. Please be careful and do your own research.