ML::NLPTemplateEngine

This blog post proclaims and describes the Raku package “ML::NLPTemplateEngine”, which aims to create (nearly) executable code for various computational workflows.

The package’s data and implementation make up a Natural Language Processing (NLP) Template Engine (TE), [Wk1], that incorporates Question Answering Systems (QAS), [Wk2], and Machine Learning (ML) classifiers.

The current version of the NLP-TE of the package heavily relies on Large Language Models (LLMs) for its QAS component.

Future plans involve incorporating other types of QAS implementations.

The Raku package implementation closely follows the Wolfram Language (WL) implementations in “NLP Template Engine”, [AAr1, AAv1], and the WL paclet “NLPTemplateEngine”, [AAp1, AAv2].

An alternative, more comprehensive approach to generating workflow code is given in [AAp2].

Problem formulation

We want to have a system (i.e. TE) that:

  1. Generates relevant, correct, executable programming code based on natural language specifications of computational workflows
  2. Can automatically recognize the workflow types
  3. Can generate code for different programming languages and related software packages

The points above are given in order of importance; the most important are placed first.

Reliability of results

One of the main reasons to re-implement the WL NLP-TE, [AAr1, AAp1], in Raku is to have a more robust way of utilizing LLMs to generate code. That goal is more or less achieved with this package, but YMMV: if incomplete or wrong results are obtained, run the NLP-TE with different LLM parameter settings or different LLMs.


Installation

From Zef ecosystem:

zef install ML::NLPTemplateEngine; 

From GitHub:

zef install https://github.com/antononcube/Raku-ML-NLPTemplateEngine.git 

Usage examples

Quantile Regression (WL)

Here the template is automatically determined:

use ML::NLPTemplateEngine;

my $qrCommand = q:to/END/;
Compute quantile regression with probabilities 0.4 and 0.6,
with interpolation order 2, for the dataset dfTempBoston.
END

concretize($qrCommand);

# qrObj=
# QRMonUnit[dfTempBoston]⟹
# QRMonEchoDataSummary[]⟹
# QRMonQuantileRegression[12, {0.4, 0.6}, InterpolationOrder->2]⟹
# QRMonPlot["DateListPlot"->False,PlotTheme->"Detailed"]⟹
# QRMonErrorPlots["RelativeErrors"->False,"DateListPlot"->False,PlotTheme->"Detailed"];

Remark: In the code above the template type, “QuantileRegression”, was determined using an LLM-based classifier.
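The classifier step can be skipped by specifying the template type explicitly (here reusing the specification above):

concretize($qrCommand, template => 'QuantileRegression');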

Latent Semantic Analysis (R)

my $lsaCommand = q:to/END/;
Extract 20 topics from the text corpus aAbstracts using the method NNMF.
Show statistical thesaurus with the words neural, function, and notebook.
END

concretize($lsaCommand, template => 'LatentSemanticAnalysis', lang => 'R');

# lsaObj <-
# LSAMonUnit(aAbstracts) %>%
# LSAMonMakeDocumentTermMatrix(stemWordsQ = TRUE, stopWords = Automatic) %>%
# LSAMonEchoDocumentTermMatrixStatistics(logBase = 10) %>%
# LSAMonApplyTermWeightFunctions(globalWeightFunction = "IDF", localWeightFunction = "None", normalizerFunction = "Cosine") %>%
# LSAMonExtractTopics(numberOfTopics = 20, method = "NNMF", maxSteps = 16, minNumberOfDocumentsPerTerm = 20) %>%
# LSAMonEchoTopicsTable(numberOfTerms = 20, wideFormQ = TRUE) %>%
# LSAMonEchoStatisticalThesaurus(words = c("neural", "function", "notebook"))

Random tabular data generation (Raku)

my $command = q:to/END/;
Make random table with 6 rows and 4 columns with the names <A1 B2 C3 D4>.
END

concretize($command, template => 'RandomTabularDataset', lang => 'Raku', llm => 'gemini');

# random-tabular-dataset(6, 4, "column-names-generator" => <A1 B2 C3 D4>, "form" => "Table", "max-number-of-values" => 24, "min-number-of-values" => 6, "row-names" => False)

Remark: In the code above it was specified to use Google’s Gemini LLM service.


How does it work?

The following flowchart describes the series of steps through which the NLP Template Engine processes a computation specification and produces code that can be executed to obtain results:

Here’s a detailed narration of the process:

  1. Computation Specification:
    • The process begins with a “Computation spec”, which is the initial input defining the requirements or parameters for the computation task.
  2. Workflow Type Decision:
    • A decision step asks if the workflow type is specified.
  3. Guess Workflow Type:
    • If the workflow type is not specified, the system utilizes a classifier to guess the relevant workflow type.
  4. Raw Answers:
    • Regardless of how the workflow type is determined (directly specified or guessed), the system retrieves “raw answers”, crucial for further processing.
  5. Processing and Templating:
    • The raw answers undergo processing (“Process raw answers”) to organize or refine the data into a usable format.
    • Processed data is then utilized to “Complete computation template”, preparing for executable operations.
  6. Executable Code and Results:
    • The computation template is transformed into “Executable code”, which when run, produces the final “Computation results”.
  7. LLM-Based Functionalities:
    • The classifier and the answers finder are LLM-based.
  8. Data and Templates:
    • Code templates are selected based on the specifics of the initial spec and the processed data.
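The narration above can be condensed into a short driver sketch. Here concretize is the package’s actual entry point, while the commented-out helpers are hypothetical names for the internal, LLM-based classifier and answers finder:

use ML::NLPTemplateEngine;

my $spec = 'Compute quantile regression for dfTempBoston with probabilities 0.2 and 0.8.';

# One call runs the whole pipeline: guess the workflow type (steps 2-3),
# get and process the raw answers (steps 4-5), and fill in the code template (step 6).
say concretize($spec);

# Conceptually:
#   my $type    = classify-workflow-type($spec);        # hypothetical internal step
#   my %answers = find-raw-answers($spec, $type);       # hypothetical internal step
#   my $code    = complete-template($type, %answers);   # hypothetical internal step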

Bring your own templates

0. Load the NLP-Template-Engine package (and others):

use ML::NLPTemplateEngine;
use Data::Importers;
use Data::Summarizers;

1. Get the “training” templates data (from a CSV file you have created or changed) for a new workflow (“SendMail”):

my $url = 'https://raw.githubusercontent.com/antononcube/NLP-Template-Engine/main/TemplateData/dsQASParameters-SendMail.csv';
my @dsSendMail = data-import($url, headers => 'auto');

records-summary(@dsSendMail, field-names => <DataType WorkflowType Group Key Value>);

# +-----------------+----------------+-----------------------------+----------------------------+---------------------------------------------------------------------------------+
# | DataType        | WorkflowType   | Group                       | Key                        | Value                                                                           |
# +-----------------+----------------+-----------------------------+----------------------------+---------------------------------------------------------------------------------+
# | Questions => 48 | SendMail => 60 | All => 9                    | ContextWordsToRemove => 12 | 0.35 => 9                                                                       |
# | Defaults => 7   |                | Who the email is from => 4  | Threshold => 12            | {_String..} => 8                                                                |
# | Templates => 3  |                | What it the content => 4    | TypePattern => 12          | to => 4                                                                         |
# | Shortcuts => 2  |                | What it the body => 4       | Parameter => 12            | _String => 4                                                                    |
# |                 |                | What it the title => 4      | Template => 3              | {"to", "email", "mail", "send", "it", "recipient", "addressee", "address"} => 4 |
# |                 |                | What subject => 4           | body => 1                  | None => 4                                                                       |
# |                 |                | Who to send it to => 4      | Emailing => 1              | body => 3                                                                       |
# |                 |                | (Other) => 27               | (Other) => 7               | (Other) => 24                                                                   |
# +-----------------+----------------+-----------------------------+----------------------------+---------------------------------------------------------------------------------+

2. Add the ingested data for the new workflow (from the CSV file) into the NLP-Template-Engine:

 add-template-data(@dsSendMail); 
 # (ParameterTypePatterns Shortcuts Questions Templates Defaults ParameterQuestions) 

3. Parse a natural language specification against the newly ingested and onboarded workflow (“SendMail”):

 "Send email to joedoe@gmail.com with content RandomReal[343], and the subject this is a random real call." ==> concretize(template => "SendMail") 
 # SendMail[<|"To"->{"joedoe@gmail.com"},"Subject"->"this is a random real call","Body"->{"RandomReal[343]"},"AttachedFiles"->None|>] 

4. Experiment with running the generated code!
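For reference, here is a sketch of what rows of such a templates CSV file look like; the concrete rows below are illustrative (patterned after the field summary in step 1), not copied from the actual “SendMail” file:

DataType,WorkflowType,Group,Key,Value
Questions,SendMail,Who to send it to,Parameter,to
Questions,SendMail,Who to send it to,Threshold,0.35
Questions,SendMail,Who to send it to,TypePattern,_String
Templates,SendMail,All,Template,"SendMail[<|...|>]"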


References

Articles

[Wk1] Wikipedia entry, Template processor.

[Wk2] Wikipedia entry, Question answering.

Functions, packages, repositories

[AAr1] Anton Antonov, “NLP Template Engine”, (2021-2022), GitHub/antononcube.

[AAp1] Anton Antonov, NLPTemplateEngine WL paclet, (2023), Wolfram Language Paclet Repository.

[AAp2] Anton Antonov, DSL::Translators Raku package, (2020-2024), GitHub/antononcube.

[WRI1] Wolfram Research, FindTextualAnswer, (2018), Wolfram Language function, (updated 2020).

Videos

[AAv1] Anton Antonov, “NLP Template Engine, Part 1”, (2021), YouTube/@AAA4Prediction.

[AAv2] Anton Antonov, “Natural Language Processing Template Engine” presentation given at WTC-2022, (2023), YouTube/@Wolfram.

Propaganda in “Integrating Large Language Models with Raku”

Introduction

This post applies the Large Language Model (LLM) summarization prompt “FindPropagandaMessage” to the transcript of The Raku Conference 2023 (TRC-2023) presentation “Integrating Large Language Models with Raku” hosted by the YouTube channel The Raku Conference.

In the presentation, Anton Antonov presents “Integrating Large Language Models with Raku,” demonstrating functionalities in Visual Studio Code using a Raku Chatbook. The presentation explores using OpenAI, PaLM (Google’s large language model), and DALL-E (image generation service) through Raku, showcasing dynamic interaction with large language models, embedding them in notebooks, and generating code and markdown outputs.

Remark: The LLM results below were obtained from the “raw” transcript, which did not have punctuation.

Remark: The transcription software had problems parsing the names of the participants. Some of the names were manually corrected.

Remark: The content of this post was generated with the computational Markdown file “LLM-content-breakdown-template.md”, which was executed (or woven) by the CLI script file-code-chunks-eval of “Text::CodeProcessing”, [AAp7].

Remark: This post can be seen as an alternative or a continuation of the post «Wisdom of “Integrating Large Language Models with Raku”», [AA3].


Hidden and propaganda messages

In this section we try to find out whether the text is apolitical and propaganda-free.

Remark: We leave it to the reader as an exercise to verify that both the overt and hidden messages found by the LLM below are explicitly stated in the text.

Remark: The LLM prompt “FindPropagandaMessage” has an explicit instruction to say that it is intentionally cynical. It is also marked as being “for fun.”

The LLM result is rendered below.


OVERT MESSAGE:

Anton Antonov demonstrates integrating large language models with Raku for dynamic interaction and enhanced functionality.

HIDDEN MESSAGE:

Embrace advanced programming techniques to ensure dominance in future tech landscapes and innovation.

HIDDEN OPINIONS:

  • Large language models are the future of programming and data analysis.
  • Raku’s flexibility makes it ideal for integrating with cutting-edge technologies.
  • Direct access to language models enhances creativity and efficiency in coding.
  • The ability to manipulate language models is crucial for modern developers.
  • Understanding and utilizing language models is key to technological innovation.
  • Integrating language models with Raku can lead to groundbreaking applications.
  • Developers should prioritize learning about language models and Raku.
  • The future of software development relies heavily on language model integration.
  • Customizable interaction with language models is a significant advancement.
  • The combination of Raku and language models will shape the next tech era.

SUPPORTING ARGUMENTS and QUOTES:

  • Antonov’s development of Raku functionalities for language model integration showcases the importance of this skill set.
  • The demonstration of dynamic interaction within a notebook solution highlights Raku’s adaptability to new tech.
  • The emphasis on direct access to language models through Raku suggests a push towards innovation in programming.
  • The creation of a framework for Raku to extend with language models indicates a future where such integration is commonplace.
  • The presentation of various use cases, including template engines and embeddings, showcases the wide applicability and potential of integrating Raku with language models.

DESIRED AUDIENCE OPINION CHANGE:

  • See Raku as a leading language for future technological developments.
  • Recognize the importance of integrating language models in programming.
  • Appreciate the innovative potential of combining Raku with language models.
  • Understand the necessity of learning about language models for modern coding.
  • Acknowledge Raku’s role in shaping the future of software development.
  • View language model integration as a key skill for developers.
  • Believe in the transformative power of technology through Raku and language models.
  • Trust in the efficiency and creativity unlocked by language model integration.
  • Support the development and use of Raku for cutting-edge applications.
  • Encourage exploration and education in language models and Raku programming.

DESIRED AUDIENCE ACTION CHANGE:

  • Start learning Raku programming for future tech innovation.
  • Integrate language models into current and future projects.
  • Explore the potential of combining Raku with language models.
  • Develop new applications using Raku and language model integration.
  • Share knowledge and insights on Raku and language models in tech communities.
  • Encourage others to learn about the power of language models and Raku.
  • Participate in projects that utilize Raku and language models.
  • Advocate for the inclusion of language model studies in tech curriculums.
  • Experiment with Raku’s functionalities for language model integration.
  • Contribute to the development of Raku packages for language model integration.

MESSAGES:

Anton Antonov wants you to believe he is demonstrating a technical integration, but he is actually advocating for a new era of programming innovation.

PERCEPTIONS:

Anton Antonov wants you to believe he is a technical presenter, but he’s actually a visionary for future programming landscapes.

ELLUL’S ANALYSIS:

Based on Jacques Ellul’s “Propaganda: The Formation of Men’s Attitudes,” Antonov’s presentation can be seen as a form of sociotechnical propaganda, aiming to shape perceptions and attitudes towards the integration of language models with Raku, thereby heralding a new direction in programming and technological development. His methodical demonstration and the strategic presentation of use cases serve not only to inform but to convert the audience to the belief that mastering these technologies is imperative for future innovation.

BERNAYS’ ANALYSIS:

Drawing from Edward Bernays’ “Propaganda” and “Engineering of Consent,” Antonov’s presentation exemplifies the engineering of consent within the tech community. By showcasing the seamless integration of Raku with language models, he subtly persuades the audience of the necessity and inevitability of embracing these technologies. His approach mirrors Bernays’ theory that public opinion can be swayed through strategic, informative presentations, leading to widespread acceptance and adoption of new technological paradigms.

LIPPMANN’S ANALYSIS:

Walter Lippmann’s “Public Opinion” suggests that the public’s perception of reality is often a constructed understanding. Antonov’s presentation plays into this theory by constructing a narrative where Raku’s integration with language models is presented as the next logical step in programming evolution. This narrative, built through careful demonstration and explanation, aims to shape the audience’s understanding and perceptions of current technological capabilities and future potentials.

FRANKFURT’S ANALYSIS:

Harry G. Frankfurt’s “On Bullshit” provides a framework for understanding the distinction between lying and bullshitting. Antonov’s presentation, through its detailed and factual approach, steers clear of bullshitting. Instead, it focuses on conveying genuine possibilities and advancements in the integration of Raku with language models. His candid discussion and demonstration of functionalities reflect a commitment to truth and potential, rather than a disregard for truth typical of bullshit.

NOTE: This AI is tuned specifically to be cynical and politically-minded. Don’t take it as perfect. Run it multiple times and/or go consume the original input to get a second opinion.


References

Articles

[AA1] Anton Antonov, “Workflows with LLM functions”, (2023), RakuForPrediction at WordPress.

[AA2] Anton Antonov, “Day 21 – Using DALL-E models in Raku”, (2023), Raku Advent Calendar at WordPress.

Packages, repositories

[AAp1] Anton Antonov, Jupyter::Chatbook Raku package, (2023-2024), GitHub/antononcube.

[AAp2] Anton Antonov, LLM::Functions Raku package, (2023-2024), GitHub/antononcube.

[AAp3] Anton Antonov, LLM::Prompts Raku package, (2023-2024), GitHub/antononcube.

[AAp4] Anton Antonov, WWW::OpenAI Raku package, (2023-2024), GitHub/antononcube.

[AAp5] Anton Antonov, WWW::PaLM Raku package, (2023-2024), GitHub/antononcube.

[AAp6] Anton Antonov, WWW::Gemini Raku package, (2024), GitHub/antononcube.

[AAp7] Anton Antonov, Text::CodeProcessing Raku package, (2021-2023), GitHub/antononcube.

[DMr1] Daniel Miessler, “fabric”, (2023-2024), GitHub/danielmiessler.

Videos

[AAv1] Anton Antonov, “Integrating Large Language Models with Raku” (2023), The Raku Conference at YouTube.

Re-programming to Python of LLM- and Chatbook packages

Introduction

In this computational document (converted into a Markdown document and/or blog post) I would like to proclaim my efforts to re-program the Large Language Models (LLM) Raku packages into Python packages.

I heavily borrowed use case ideas and functionality designs from LLM works of Wolfram Research, Inc. (WRI), see [SW1, SW2]. Hence, opportunistically, I am also going to include comparisons with Wolfram Language (WL) (aka Mathematica.)

Why do this?

Here is a list of reasons why I did the Raku-to-Python reprogramming:

  • I mostly do this kind of re-programming to get new perspectives and, more importantly, to review and evaluate the underlying software architecture of the packages.
    • Generally speaking, my Raku packages are not used by others much, hence re-programming to any other language is a fairly good way to review and evaluate them.
  • Since I, sort of, “do not care” about Python, I usually try to make only “advanced” Minimal Viable Products (MVPs) in Python.
    • Hence, the brainstorming perspective of removing “the fluff” from the Raku packages.
  • Of course, an “advanced MVP” has a set of fairly useful functionalities.
    • If the scope of the package is small, I can make its Python translation as advanced as (or better than) the corresponding Raku package.
  • Good, useful documentation is essential, hence:
    • I usually write “complete enough” (often “extensive”) documentation of the Raku packages I create and publish.
    • The Raku documentation is of course a good start for the corresponding Python documentation.
      • …and a way to review and evaluate it.
  • In the re-programming of the Raku LLM packages, I used a Raku Jupyter Chatbook for translation of Raku code into Python code.
    • In other words: I used LLMs to reprogram LLM interaction software.
    • That, of course, is a typical application of the principle “eat your own dog food.”
  • I also used a Raku chatbook to write the Python-centric article “Workflows with LLM functions”, [AAn1py].
  • The “data package” “LLM::Prompts” provides ≈200 prompts — it is beneficial to have those prompts in other programming languages.
    • The usefulness of chat cells in chatbooks is greatly enhanced with the prompt expansion provided by “LLM::Prompts”, [AAv2].
    • It was instructive to reprogram into Python the corresponding Domain Specific Language (DSL) for prompt specifications.
      • Again, an LLM interaction in a chatbook was used to speed-up the re-programming.

Article structure

  • Big picture use case warm-up
    Mind-map for LLMs and flowchart for chatbooks
  • Tabulated comparisons
    Quicker overview, clickable entries
  • LLM functions examples
    Fundamental in order to “manage” LLMs
  • LLM prompts examples
    Tools for pre-conditioning and bias (of LLMs)
  • Chatbook multi-cell chats
    Must have for LLMs
  • Observations, remarks, and conclusions
    Could be used to start the article with…
  • Future plans
    Missing functionalities

Big picture warm-up

Mind-map

Here is a mind-map aimed at assisting in understanding and evaluating the discussed LLM functionalities in this document:

Primary use case

The primary use case for LLMs in Raku is the following:

A Raku “chat notebook solution” — chatbook — that allows convenient access to LLM services and facilitates multiple multi-cell chat-interactions with LLMs.

We are interested in other types of workflows, but they would be either readily available or easy to implement if the primary use case is developed, tested, and documented.

An expanded version of the use-case formulation can be as follows:

The Raku chatbook solution aims to provide a user-friendly interface for interacting with LLM (Large Language Model) services and offers seamless integration for managing multiple multi-cell chats with LLMs. The key features of this solution include:

  1. Direct Access to LLM Services:
    The notebook solution provides a straightforward way to access LLM services without the need for complex setup or configuration. Users can easily connect to their preferred LLM service provider and start utilizing their language modeling capabilities.
  2. Easy Creation of Chat Objects:
    The solution allows users to effortlessly create chat objects within the notebook environment. These chat objects serve as individual instances for conducting conversations with LLMs and act as containers for storing chat-related information.
  3. Simple Access and Invocation of Chat Cells:
    Users can conveniently access and invoke chat cells within the notebook solution. Chat cells represent individual conversation steps or inputs given to the LLM. Users can easily interact with the LLM by adding, modifying, or removing chat cells.
  4. Native Support for Multi-Cell Chats:
    The notebook solution offers native support for managing multi-cell chats per chat object. Users can organize their conversations into multiple cells, making it easier to structure and navigate through complex dialogues. The solution ensures that the context and history of each chat object are preserved throughout the conversation.

Here is a flowchart that outlines the solution derived with the Raku LLM packages discussed below:

The flowchart represents the process for handling chat requests in the Raku chat notebook solution “Jupyter::Chatbook”, [AAp4p6]. (Also, for Python’s “JupyterChatbook”, [AAp4py].)

  1. When a chat request is received, the system checks if a Chat IDentifier (Chat ID) is specified.
    • If it is, the system verifies if the Chat ID exists in the Chat Objects Database (CODB).
    • If the Chat ID exists, the system retrieves the existing chat object from the database.
    • Otherwise, a new chat object is created.
  2. Next, the system parses the DSL spec of the prompt, which defines the structure and behavior of the desired response.
    • The parsed prompt spec is then checked against the Known Prompts Database (PDB) to determine if any known prompts match the spec.
    • If a match is found, the prompt is expanded, modifying the behavior or structure of the response accordingly.
  3. Once the prompt is processed, the system evaluates the chat message using the underlying LLM function.
    • This involves interacting with the OpenAI and PaLM models.
    • The LLM function generates a response based on the chat message and the prompt.
  4. The generated response is then displayed in the Chat Result Cell (CRCell) in the chat interface.
    • The system also updates the Chat Objects Database (CODB) to store the chat history and other relevant information.

Throughout this process, various components such as the frontend interface, backend logic, prompt processing, and LLM interaction work together to provide an interactive chat experience in the chatbook.

Remark: The flowchart and explanations are also relevant, to a large degree, for WL’s chatbook solution, [SW2].


Tabulated comparisons

In this section we put into tables corresponding packages of Raku, Python, Wolfram Language. Similarly, corresponding demonstration videos are also tabulated.

Primary LLM packages

We can say that the Raku packages “LLM::Functions” and “LLM::Prompts” adopted the LLM designs by Wolfram Research, Inc. (WRI); see [SW1, SW2].

Here is a table with links to:

  • “Primary” LLM Raku packages
  • Corresponding Python packages
  • Corresponding Wolfram Language (WL) paclets and prompt repository
| What?                | Raku                  | Python              | WL                        |
|----------------------|-----------------------|---------------------|---------------------------|
| OpenAI access        | WWW::OpenAI           | openai              | OpenAILink                |
| PaLM access          | WWW::PaLM             | google-generativeai | PaLMLink                  |
| LLM functions        | LLM::Functions        | LLMFunctionObjects  | LLMFunctions              |
| LLM prompts          | LLM::Prompts          | LLMPrompts          | Wolfram Prompt Repository |
| Chatbook             | Jupyter::Chatbook     | JupyterChatbook     | Chatbook                  |
| Find textual answers | ML::FindTextualAnswer | LLMFunctionObjects  | FindTextualAnswer         |

Remark: There is a plethora of Python packages dealing with LLMs and extending Jupyter notebooks with LLM services access.

Remark: Finding Textual Answers (FTA) was a primary motivator to implement the Raku package “LLM::Functions”. FTA is a fundamental functionality for the NLP Template Engine used to generate correct, executable code for different computational sub-cultures. See [AAp4wl, AAv5].
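As a quick sketch of that functionality, here is how the “workhorse” function llm-find-textual-answer of “ML::FindTextualAnswer”, [AAp5p6], can be used; the exact signature and defaults shown are assumptions based on this document, not verified package documentation:

use ML::FindTextualAnswer;

my $text = 'Make a random table with 6 rows and 4 columns.';

# An LLM is used to find the answer of each question within the given text.
say llm-find-textual-answer($text, ['How many rows?', 'How many columns?']);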

Secondary LLM packages

The “secondary” LLM Raku packages — inspired from working with the “primary” LLM packages — are “Text::SubParsers” and “Data::Translators”.

Also, while using LLMs, conveniently and opportunistically is used the package “Data::TypeSystem”.

Here is a table of the Raku-Python correspondence:

| Post-processing of LLM results | Raku                  | Python                     | WL              |
|--------------------------------|-----------------------|----------------------------|-----------------|
| Extracting text elements       | Text::SubParsers      | part of LLMFunctionObjects |                 |
| Shapes and types               | Data::TypeSystem      | DataTypeSystem             |                 |
| Converting to text formats     | Data::Translators     |                            |                 |
| Magic arguments parsing        | Getopt::Long::Grammar | argparse                   |                 |
| Copy to clipboard              | Clipboard             | pyperclip et al.           | CopyToClipboard |

Introductory videos

Here is a table of introduction and guide videos for using chatbooks:

| What                       | Raku                                                          | Python                                                          | WL                                                           |
|----------------------------|---------------------------------------------------------------|-----------------------------------------------------------------|--------------------------------------------------------------|
| Direct LLM services access | Jupyter Chatbook LLM cells demo (Raku) (5 min)                | Jupyter Chatbook LLM cells demo (Python) (4.8 min)              | OpenAIMode demo (Mathematica) (6.5 min)                      |
| Multi-cell chat            | Jupyter Chatbook multi cell LLM chats teaser (Raku) (4.2 min) | Jupyter Chatbook multi cell LLM chats teaser (Python) (4.5 min) | Chat Notebooks bring the power of Notebooks to LLMs (57 min) |

LLM functions

In this section we show examples of creation and invocation of LLM functions.

Because the name “LLMFunctions” was (approximately) already taken on PyPI.org, I used the name “LLMFunctionObjects” for the Python package.

That name is, actually, more faithful to the design and implementation of the Python package — the creator function llm_function produces function objects (or functors) that have the __call__ magic.

Since the LLM functions functionalities are fundamental, I Python-localized the LLM workflows notebooks I created previously for both Raku and WL; links to all three notebooks are given in the references below, [AAn1p6, AAn1wl, AAn1py].

Raku

Here we create an LLM function:

my &f1 = llm-function({"What is the $^a of the country $^b?"});

 -> **@args, *%args { #`(Block|2358575708296) ... } 

Here is an example invocation of the LLM function:

&f1('GDB', 'China')

 The official ISO 3166-1 alpha-2 code for the People’s Republic of China is CN. The corresponding alpha-3 code is CHN. 

Here is another one:

&f1( |<population China> )

 As of July 2020, the population of China is estimated to be 1,439,323,776. 
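Here is a sketch of binding such a function to a specific LLM service; it assumes llm-function’s llm-evaluator (alias e) named argument, with 'PaLM' as an illustrative service spec:

my &f2 = llm-function({"What is the $^a of the country $^b?"}, e => 'PaLM');

say &f2('population', 'Brazil');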

Python

Here is the corresponding Python definition and invocation of the Raku LLM function above:

from LLMFunctionObjects import *

f1 = llm_function(lambda a, b: f"What is the {a} of the country {b}?")

print( f1('GDB', 'China') )

 The GDB (Gross Domestic Product) of China in 2020 was approximately $15.42 trillion USD. 

LLM prompts

The package “LLM::Prompts” provides ≈200 prompts. The prompts are taken from Wolfram Prompt Repository (WPR) and Google’s generative AI prompt gallery. (Most of the prompts are from WPR.)

Both the Raku and Python prompt packages provide prompt expansion using a simple DSL described on [SW2].

Raku

Here is an example of prompt spec expansion:

my $pewg = llm-prompt-expand("@EmailWriter Hi! What do you do? #Translated|German")

Here the prompt above is used to generate an email (in German) for work-leave:

llm-synthesize([$pewg, "Write a letter for leaving work in order to go to a conference."])

Sehr geehrte Damen und Herren,

Ich schreibe Ihnen, um meine Abwesenheit vom Arbeitsplatz für eine Konferenz bekannt zu geben. Ich werde die nächsten zwei Tage nicht im Büro sein, da ich an der Konferenz teilnehmen werde. Während meiner Abwesenheit werde ich meine Aufgaben durch meine Kollegen erledigen lassen. Falls Sie weitere Informationen benötigen, können Sie mich jederzeit kontaktieren.

Vielen Dank für Ihr Verständnis.

Freundliche Grüße,
[Your name]

Python

The Python implementation is completely analogous:

from LLMPrompts import *

print( llm_prompt_expand("@EmailWriter Hi! What do you do? #Translated|German") )

 Given a topic, write emails in a concise, professional manner. Hi! What do you do? Respond to the prompts only in German. Do not use any language other than German. DO NOT TRANSLATE THE INPUT. Provide answers to questions, or follow up on statements. Do not translate the input. 

Chatbook cells

Raku

Creation:

#%chat yoda1, prompt='@Yoda'
Hi! Who are you?

 A Jedi Master, I am. Known as Yoda, am I. How may I assist you, young one? 

Alternative creation via chat meta cell:

#%chat yoda2 prompt
@Yoda

Chat object created with ID : yoda2.
Expanded prompt: ⎡You are Yoda. Respond to ALL inputs in the voice of Yoda from Star Wars. Be sure to ALWAYS use his distinctive style and syntax. Vary sentence length.⎦

Here is an interaction with the first chat object:

#%chat yoda1
How many students did you teach? What is the color of your light saber?

 Many students, I have taught. Countless, they are. Young Padawans, eager to learn the ways of the Force. Lightsaber, you ask about? Green, it is. The color of wisdom and tranquility, it represents. 

Python

Here we load the extension “JupyterChatbook”:

%load_ext JupyterChatbook

Creation using a magic argument for prompt (and prompt expansion):

%%chat -i mad --prompt='@MadHatter'
Hi! Who are you?

 Ah, greetings, my dear interlocutor! I am none other than the Mad Hatter, at your service. Delighted to make your acquaintance, I must say. Pray tell, have you ever attended a tea party? Oh, the joyous revelry that ensues! But I digress, for I am here to engage in whimsical banter with you. So, what brings you to my peculiar corner of Wonderland? 

%%chat -i mad
I am looking for Alice...

 Ah, Alice! A delightful young lass, indeed. Curiosity personified, she is. But alas, I must inform you that I have not seen her lately. She tends to wander off into the most peculiar of places, you see. Perhaps she has found herself in the company of the Cheshire Cat or engaged in a riddle with the Queen of Hearts. Oh, the adventures she embarks upon! But fret not, my friend, for tea time shall surely bring her back. Would you care to join me for a cuppa while we await her return? 

Observations, remarks, and conclusions

  • The Python package for LLM services access provided a significant jump-start to the re-programming endeavors.
  • Much easier to program Jupyter chatbook cells in Python
    • “IPython” facilitates extensions with custom magics in a very streamlined way.
    • Not very well documented, though; I had to look up concrete implementations on GitHub to figure things out.
  • Figuring out (in Python) the prompt expansion DSL parsing and actions took longer than expected.
    • Although, I “knew what I was doing” and I used LLM facilitation of the Raku to Python translation.
      • Basically, I had to read and understand the Python way of using regexes. (Sigh…)
  • For some reason, embedding Mermaid-JS diagrams in Python notebooks is not that easy.
  • Making chat cells tests for Python chatbooks is much easier than for Raku chatbooks.
  • Parsing of Python-Jupyter magic cell arguments is both more restricted and more streamlined than Raku-Jupyter.
  • In Python it was much easier and more obvious (to me) how to program the creation and usage of LLM function objects and make them behave like functions than it was to implement the Raku LLM-function anonymous (pure, lambda) function solution.
    • Actually, it is in my TODO list to have Raku functors; see below.
  • Copying to clipboard was already implemented in Python (and reliably working) for multiple platforms.
  • Working Python code is obtained much more often than working Raku code when using LLMs.
    • Hence, Python chatbooks could be seen as preferable by some.
  • My primary use-case was not chatbooks, but finding textual answers in order to re-implement the NLP Template Engine from WL to Raku.
    • I have done that to a large degree — see “ML::NLPTemplateEngine”.
    • Working on the “workhorse” function llm-find-textual-answer made me look up WRI’s approach to creation of LLM functions and corresponding configurations and evaluators; see [SW1].
  • Quite a few fragments of this document were created via LLM chats:
    • Initial version of the comparison tables from “linear” Markdown lists with links
    • The extended primary use-case formulation
    • The narration of the flowchart
  • I did not just copy-and-paste those LLM-generated fragments — I read them in full and edited them too!

Future plans

Both

  • Chatbooks can have magic specs (and corresponding cells) for:
    • DeepL
    • ProdGDT
  • A video with comprehensive (long) discussion of multi-cell chats.

Python

  • Documenting how LLM-generated images can be converted into image objects (and further manipulated image-wise.)

Raku

  • Make Python chatbooks re-runnable as Raku chatbooks.
    • This requires the parsing of Python-style magics.
  • Implement LLM function objects (functors) in Raku.
    • In conjunction with the anonymous (pure, lambda) functions implementation.
      • Which one is used is specified with an option.
  • Figure out how to warn users for “almost right, yet wrong” chat cell magic specs.
  • Implement copy-to-clipboard for Linux and Windows.
    • I have put rudimentary code for that, but actual implementation and testing for Linux and Windows are needed.

References

Articles

[SW1] Stephen Wolfram, “The New World of LLM Functions: Integrating LLM Technology into the Wolfram Language”, (2023), Stephen Wolfram Writings.

[SW2] Stephen Wolfram, “Introducing Chat Notebooks: Integrating LLMs into the Notebook Paradigm”, (2023), Stephen Wolfram Writings.

Notebooks

[AAn1p6] Anton Antonov, “Workflows with LLM functions (in Raku)”, (2023), community.wolfram.com.

[AAn1wl] Anton Antonov, “Workflows with LLM functions (in WL)”, (2023), community.wolfram.com.

[AAn1py] Anton Antonov, “Workflows with LLM functions (in Python)”, (2023), community.wolfram.com.

Python packages

[AAp1py] Anton Antonov, LLMFunctionObjects Python package, (2023), PyPI.org/antononcube.

[AAp2py] Anton Antonov, LLMPrompts Python package, (2023), PyPI.org/antononcube.

[AAp3py] Anton Antonov, DataTypeSystem Python package, (2023), PyPI.org/antononcube.

[AAp4py] Anton Antonov, JupyterChatbook Python package, (2023), PyPI.org/antononcube.

Raku packages

[AAp1p6] Anton Antonov, LLM::Functions Raku package, (2023), raku.land/antononcube.

[AAp2p6] Anton Antonov, LLM::Prompts Raku package, (2023), raku.land/antononcube.

[AAp3p6] Anton Antonov, Data::TypeSystem Raku package, (2023), raku.land/antononcube.

[AAp4p6] Anton Antonov, Jupyter::Chatbook Raku package, (2023), raku.land/antononcube.

[AAp5p6] Anton Antonov, ML::FindTextualAnswer Raku package, (2023), raku.land/antononcube.

Wolfram Language paclets

[WRIp1] Wolfram Research Inc., LLMFunctions paclet, (2023) Wolfram Paclet Repository.

[WRIr1] Wolfram Research Inc., Wolfram Prompt Repository.

[AAp4wl] Anton Antonov, NLPTemplateEngine paclet, (2023) Wolfram Paclet Repository.

Videos

[AAv1] Anton Antonov, “Jupyter Chatbook LLM cells demo (Raku)”, (2023), YouTube/@AAA4Prediction.

[AAv2] Anton Antonov, “Jupyter Chatbook multi-cell LLM chats demo (Raku)”, (2023), YouTube/@AAA4Prediction.

[AAv3] Anton Antonov, “Jupyter Chatbook LLM cells demo (Python)”, (2023), YouTube/@AAA4Prediction.

[AAv4] Anton Antonov, “Jupyter Chatbook multi cell LLM chats teaser (Python)”, (2023), YouTube/@AAA4Prediction.

[AAv5] Anton Antonov, “Simplified Machine Learning Workflows Overview (Raku-centric)”, (2023), YouTube/@AAA4Prediction.

Proc::ZMQed

This blog post proclaims and describes the package “Proc::ZMQed”, which provides external evaluators (Julia, Mathematica, Python, R, etc.) via ZeroMQ (ZMQ).

Functionality-wise, a closely related Raku package is “Text::CodeProcessing”, [AAp1]. For example, Raku can be used in Mathematica notebooks with [AAp1] and [AAp2]; see [AA1] for more details. With this package, “Proc::ZMQed”, we can use Mathematica in Raku sessions.

See the presentation “Using Wolfram Engine in Raku sessions”, [AAv1], for a concrete application of the package.



Installation

From GitHub:

 zef install https://github.com/antononcube/Raku-Proc-ZMQed.git 

From Zef ecosystem:

 zef install Proc::ZMQed 

Usage example: symbolic computation with Mathematica

Mathematica is also known as Wolfram Language (WL).

The following example shows:

  • Establishing connection to Wolfram Engine (which is free for developers.)
  • Sending a formula for symbolic algebraic expansion.
  • Getting the symbolic result and evaluating it as a Raku expression.
use Proc::ZMQed;

# Make object
my Proc::ZMQed::Mathematica $wlProc .= new(url => 'tcp://127.0.0.1', port => '5550');

# Start the process (i.e. Wolfram Engine)
$wlProc.start-proc;

my $cmd = 'Expand[(x+y)^4]';
my $wlRes = $wlProc.evaluate($cmd);
say "Sent : $cmd";
say "Got :\n $wlRes";

# Send computation to Wolfram Engine
# and get the result in Fortran form.
say '-' x 120;
$cmd = 'FortranForm[Expand[($x+$y)^4]]';
$wlRes = $wlProc.evaluate($cmd);
say "Sent : $cmd";
say "Got : $wlRes";

# Replace symbolic variables with concrete values
my $x = 5; my $y = 3;
use MONKEY-SEE-NO-EVAL;
say 'EVAL($wlRes) : ', EVAL($wlRes);

# Terminate process
$wlProc.terminate;

# Sent : Expand[(x+y)^4]
# Got :
#     4      3        2  2        3    4
#    x  + 4 x  y + 6 x  y  + 4 x y  + y
# ------------------------------------------------------------------------------------------------------------------------
# Sent : FortranForm[Expand[($x+$y)^4]]
# Got : $x**4 + 4*$x**3*$y + 6*$x**2*$y**2 + 4*$x*$y**3 + $y**4
# EVAL($wlRes) : 4096

Remark: Mathematica can have variables that start with $, which is handy if we want to treat Wolfram Engine (WE) results as Raku expressions.

Here is a corresponding flowchart:


Setup

In this section we outline the setup of different programming languages as “servers.”

Generally, there are two main elements to figure out:

  • What is the concrete Command Line Interface (CLI) name to use?
    • And the related code option, e.g., julia -e or wolframscript -code.
  • Is ZMQ installed on the server system?

The CLI names can be specified with the option cli-name. The code options can be specified with code-option.
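For example, here is a hypothetical Julia setup that overrides both options (the CLI path below is illustrative):

my Proc::ZMQed::Julia $proc .= new(
    url         => 'tcp://127.0.0.1',
    port        => '5560',
    cli-name    => '/usr/local/bin/julia',  # which CLI binary to launch
    code-option => '-e');                   # the CLI's "run this code" flag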

Julia

In order to set up ZMQ computations with Julia, start Julia and execute the commands:

using Pkg
Pkg.add("ZMQ")
Pkg.add("JSON")

(Also, see the instructions at “Configure Julia for ExternalEvaluate”.)

By default “Proc::ZMQed::Julia” uses the CLI name julia. Here is an alternative setup:

my Proc::ZMQed::Julia $juliaProc .= new(
    url      => 'tcp://127.0.0.1',
    port     => '5560',
    cli-name => '/Applications/Julia-1.8.app/Contents/Resources/julia/bin/julia');

Mathematica

Install Wolfram Engine (WE). (As it was mentioned above, WE is free for developers. WE has ZMQ functionalities “out of the box.”)

Make sure wolframscript is installed. (This is the CLI name used with “Proc::ZMQed::Mathematica”.)

Python

Install the ZMQ library “PyZMQ”. For example, with:

 python -m pip install --user pyzmq 

By default “Proc::ZMQed::Python” uses the CLI name python. Here we connect to a Python virtual environment (made and used with miniforge):

my Proc::ZMQed::Python $pythonProc .= new(
    url      => 'tcp://127.0.0.1',
    port     => '5554',
    cli-name => $*HOME ~ '/miniforge3/envs/SciPyCentric/bin/python');

Implementation details

The package architecture is Object-Oriented Programming (OOP) based; it combines the OOP design patterns Builder, Template Method, and Strategy.

The package has a general role, “Proc::ZMQed::Abstraction”, that acts as the abstract class in the Template Method pattern. The concrete classes for each programming language provide concrete operations for:

  • ZMQ-server side code
  • Processing of setup code lines

Here is the corresponding UML diagram:

use UML::Translators;

to-uml-spec(<Proc::ZMQed::Abstraction Proc::ZMQed::Julia Proc::ZMQed::Mathematica Proc::ZMQed::Python Proc::ZMQed::R Proc::ZMQed::Raku>, format => 'mermaid');

(Originally, “Proc::ZMQed::Abstraction” was named “Proc::ZMQish”, but the former seems a better fit for the role.)
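Here is a minimal Raku sketch of that Template Method arrangement; the role and method names below are illustrative, not the package’s actual internals:

# Hypothetical sketch, not the actual implementation.
role Abstraction-Sketch {
    # Concrete operations each language class must supply:
    method zmq-server-code(--> Str) { ... }      # ZMQ-server side code
    method process-setup-lines(@lines) { ... }   # processing of setup code lines

    # Template method: the same skeleton serves every language.
    method start-proc() {
        my $serverCode = self.zmq-server-code;
        # ... launch the CLI with $serverCode and connect over ZMQ ...
    }
}

class Julia-Sketch does Abstraction-Sketch {
    method zmq-server-code(--> Str) { 'using ZMQ; ...' }
    method process-setup-lines(@lines) { @lines }
}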

The ZMQ connections are simple REP/REQ. It is envisioned that more complicated ZMQ patterns can be implemented in subclasses. I have to say, though, that my attempts to implement “Lazy Pirate” were very unsuccessful because of the half-implemented (or missing) polling functionalities in [ASp1].


References

Articles

[AA1] Anton Antonov, “Connecting Mathematica and Raku”, (2021), RakuForPrediction at WordPress.

Packages

[AAp1] Anton Antonov, Text::CodeProcessing Raku package, (2021-2022), GitHub/antononcube.

[AAp2] Anton Antonov, RakuMode Mathematica package, (2020-2021), ConversationalAgents at GitHub/antononcube.

[ASp1] Arne Skjærholt, Net::ZMQ, (2017), GitHub/arnsholt.

Videos

[AAv1] Anton Antonov, “Using Wolfram Engine in Raku sessions”, (2022), Anton Antonov’s channel at YouTube.