
Paying homage to AlphaGo, we've launched our own AI Go project at ToshiStats!

Reinforcement learning has become a hot topic since the release of OpenAI's o1-preview. Looking back, it was Google DeepMind's AlphaGo, released in March 2016, that truly brought reinforcement learning into the public eye. Go, with its vast search space, was traditionally a formidable challenge for computers. Amateur high-dan levels were roughly the limit at the time. However, AlphaGo, combining reinforcement learning and Monte Carlo Tree Search (MCTS), exceeded expert expectations, becoming the first AI Go player to defeat a top professional. Inspired by this, we've launched our own AI Go project, "ToshiStats-Go project," to research reinforcement learning. We're excited to see what we can achieve.

 

1. Creating a Go Game Environment

We've decided to build our own Go game environment from scratch. Given the exceptional coding capabilities of o1-preview, we're using it as a coding assistant for this project. We're iteratively developing the code by requesting o1-preview to generate the Go game environment code, executing it in Google Colab, then requesting further refinements based on the results, and repeating the process. Within a few iterations, we were able to establish a basic framework and a functional environment. While we can't perfectly implement a complex game like Go, we've created something akin to "simple-go." This should be sufficient for implementing reinforcement learning and improving its accuracy. Below is an example of o1-preview's explanation of a code modification. As you can see, it's quite detailed.

o1-preview's explanation of code modification
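
To make "simple-go" concrete, here is a minimal sketch of the kind of environment we mean (an illustration with our own names, not the actual o1-preview-generated code): stones are placed on a small board, and any group left without liberties is captured; ko and scoring are deliberately omitted.

```python
# A minimal "simple-go" sketch (illustrative, not the actual o1-preview
# generated code): a small board, stone placement, and capture of groups
# that run out of liberties. Ko and scoring are deliberately omitted.
import numpy as np

EMPTY, BLACK, WHITE = 0, 1, 2

class SimpleGo:
    def __init__(self, size=5):
        self.size = size
        self.board = np.zeros((size, size), dtype=int)

    def neighbors(self, r, c):
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < self.size and 0 <= nc < self.size:
                yield nr, nc

    def group_and_liberties(self, r, c):
        """Flood-fill the group containing (r, c) and collect its liberties."""
        color = self.board[r, c]
        group, liberties, stack = {(r, c)}, set(), [(r, c)]
        while stack:
            cr, cc = stack.pop()
            for nr, nc in self.neighbors(cr, cc):
                if self.board[nr, nc] == EMPTY:
                    liberties.add((nr, nc))
                elif self.board[nr, nc] == color and (nr, nc) not in group:
                    group.add((nr, nc))
                    stack.append((nr, nc))
        return group, liberties

    def play(self, r, c, color):
        """Place a stone, capture dead opponent groups, forbid suicide."""
        if self.board[r, c] != EMPTY:
            return False
        self.board[r, c] = color
        opponent = BLACK if color == WHITE else WHITE
        for nr, nc in self.neighbors(r, c):
            if self.board[nr, nc] == opponent:
                group, libs = self.group_and_liberties(nr, nc)
                if not libs:
                    for gr, gc in group:
                        self.board[gr, gc] = EMPTY
        _, libs = self.group_and_liberties(r, c)
        if not libs:  # suicide: undo the move
            self.board[r, c] = EMPTY
            return False
        return True
```

Usage is as simple as game = SimpleGo(); game.play(2, 2, BLACK). The actual environment lives in the Colab notebook (1).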

 

2. Trying a Game of Go

Let's give it a try! The current AI model plays random moves, so it's not very strong. As shown in the example below, a human can win with careful play. While a 9x9 board is available, the calculations can be time-consuming, so we'll stick with a 5x5 board for now. It's enjoyable enough, and if you'd like to try it yourself, please download the Colab notebook from our GitHub repository (1). A GPU is not required.

Trial run of ToshiStats-Go

 

3. Perfect Go Rules Are Difficult

Go has some very complex rules. In particular, determining the life and death of stones, especially in the endgame, proved challenging. Implementing "ko" and "seki" also seems difficult. Connecting to an external Go system might solve these issues, but for now we'll continue with a lightweight environment that completes calculations within the notebook, to facilitate reinforcement learning experimentation. We'll strive to make this series engaging and easy to follow, comparing our progress against simpler games such as Gomoku (five in a row). We appreciate your continued interest.
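
For the curious, one lightweight way to approximate ko that we may try later is a positional-superko check: forbid any move that recreates a previous whole-board position. A hedged sketch, assuming a NumPy board like the one above:

```python
# A sketch of a simple positional-superko check (illustrative, not our
# current implementation): reject a move if the resulting whole-board
# position has already occurred in the game.
seen_positions = set()

def violates_superko(board_after_move) -> bool:
    return board_after_move.tobytes() in seen_positions

def record_position(board) -> None:
    seen_positions.add(board.tobytes())
```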

 

So, there you have it! We've successfully implemented a Go playing environment in Colab. From here, we'll dive into reinforcement learning and begin training our AI Go player. Stay tuned!


1) ToshiStatsGo-project

Notice: ToshiStats Co., Ltd. and I do not accept any responsibility or liability for loss or damage occasioned to any person or property through using materials, instructions, methods, algorithms or ideas contained herein, or acting or refraining from acting as a result of such use. ToshiStats Co., Ltd. and I expressly disclaim all implied warranties, including merchantability or fitness for any particular purpose. There will be no duty on ToshiStats Co., Ltd. and me to correct any errors or defects in the codes and the software.

The combination of Monte Carlo Tree Search (MCTS) and generative AI could be a real game-changer in the future!

"Monte Carlo Tree Search," a search technique, gained fame in March 2016 when AlphaGo became the first AI to defeat a top professional Go player. Its effectiveness increases significantly when combined with reinforcement learning, making it a powerful tool. However, implementing it can be quite challenging. With the recent release of ChatGPT canvas (1) on October 3rd, I want to explore implementing Monte Carlo Tree Search in a simple game. Let's begin!

 

1. AlphaGo and Monte Carlo Tree Search
AlphaGo, which decisively defeated 18-time Go world champion Lee Sedol in March 2016, owed its strength to the combination of reinforcement learning and Monte Carlo Tree Search (MCTS), as discussed previously. A research paper (2) illustrates the performance comparison of various Go AI programs.

Performance comparison of various Go AI programs.

The leftmost "Raw network" doesn't utilize MCTS during inference, resulting in lower performance compared to AlphaGo Zero next to it. This highlights the significant contribution of MCTS. In AlphaGo Zero, MCTS is executed as shown in the diagram below. The action probability 'p' is trained to approach the probability 'π' of the next move selected by MCTS, gradually improving accuracy. For details, please refer to (2).

AlphaGo Zero "Monte Carlo Tree Search"
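
For reference, the objective that ties the policy p to the MCTS probabilities π in (2) combines a value loss, a policy cross-entropy, and L2 regularization. With network output f_θ(s) = (p, v) and game outcome z, the loss is:

```latex
% AlphaGo Zero training loss (2): squared value error, cross-entropy between
% the MCTS move probabilities \pi and the network policy p, and an L2
% penalty on the network parameters \theta (weighted by c).
l = (z - v)^2 - \pi^{\top} \log p + c \lVert \theta \rVert^{2}
```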


2. Implementing MCTS in a simple game
Witnessing MCTS's success in AlphaGo makes you want to try it out yourself. The recent release of ChatGPT canvas (1) from OpenAI provides the perfect opportunity. As their message "A new way of working with ChatGPT to write and code" suggests, it offers a new user experience. I promptly asked ChatGPT canvas, "Could you make code of Tic Tac Toe by using python and MCTS?"

Unlike regular ChatGPT, a separate window opens and generates Python code as shown below.

I also wanted an explanation, so I added a prompt to provide it in English. Since the generated code cannot be executed within the canvas, I copied and pasted it into Google Colab to run it.

I was able to enjoy the game as shown below. Fantastic!
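
The canvas-generated code is published at (3); as a reference point, here is my own condensed sketch of the same idea, a UCT-style MCTS player for Tic-Tac-Toe (names are illustrative, not the generated code):

```python
# A condensed UCT-style MCTS for Tic-Tac-Toe (my own minimal sketch,
# not the code ChatGPT canvas generated; that version is at (3)).
import math, random

def winner(b):
    lines = [(0,1,2),(3,4,5),(6,7,8),(0,3,6),(1,4,7),(2,5,8),(0,4,8),(2,4,6)]
    for i, j, k in lines:
        if b[i] != " " and b[i] == b[j] == b[k]:
            return b[i]
    return "draw" if " " not in b else None

def moves(b):
    return [i for i, c in enumerate(b) if c == " "]

class Node:
    def __init__(self, board, player, parent=None, move=None):
        self.board, self.player = board, player   # player: side to move
        self.parent, self.move = parent, move
        self.children, self.untried = [], moves(board)
        self.wins, self.visits = 0.0, 0

    def ucb_child(self, c=1.4):
        return max(self.children, key=lambda n: n.wins / n.visits
                   + c * math.sqrt(math.log(self.visits) / n.visits))

def rollout(board, player):
    b, p = board[:], player
    while winner(b) is None:
        b[random.choice(moves(b))] = p
        p = "O" if p == "X" else "X"
    return winner(b)

def mcts(board, player, iters=2000):
    root = Node(board[:], player)
    for _ in range(iters):
        node = root
        # 1. Selection: descend while fully expanded.
        while not node.untried and node.children:
            node = node.ucb_child()
        # 2. Expansion: try one untried move (unless terminal).
        if node.untried and winner(node.board) is None:
            m = node.untried.pop()
            b = node.board[:]
            b[m] = node.player
            child = Node(b, "O" if node.player == "X" else "X", node, m)
            node.children.append(child)
            node = child
        # 3. Simulation: random playout to the end.
        result = rollout(node.board, node.player)
        # 4. Backpropagation: credit the player who moved into each node.
        while node:
            node.visits += 1
            mover = "O" if node.player == "X" else "X"
            if result == mover:
                node.wins += 1
            elif result == "draw":
                node.wins += 0.5
            node = node.parent
    return max(root.children, key=lambda n: n.visits).move

print("MCTS plays:", mcts([" "] * 9, "X"))
```

The design choice that matters is in step 4: each node's statistics are kept from the perspective of the player who made the move into that node, which is exactly what the UCB selection in step 1 assumes.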

The generative AI model GPT-4o, which powers ChatGPT canvas, appears to have improved coding abilities, likely due to post-training on data distilled from the recently released, logically robust o1-preview. While I encountered occasional errors, copying and pasting them into a prompt for correction quickly resolved the issues. It felt like a significant upgrade to a full-fledged code assistant. I'm eager to use it more. The generated code can be found at (3).

 

3. Promising combination of Generative AI and MCTS
Research on incorporating the AlphaGo mechanism into generative AI is actively underway. Versions after AlphaGo Zero, released in 2017, don't require any human input (in this case, game records). This freedom from data constraints makes it a promising technology to address training data scarcity. The combination of reinforcement learning and MCTS offers flexible design possibilities, making it highly intriguing for developers. From the perspective of test-time computing, highlighted by OpenAI's o1-preview, it's a technology worth focusing on. In the next post, I plan to delve deeper into MCTS by examining published research papers. Stay tuned!

 

What do you think? The concept of MCTS is relatively simple, which broadens its applicability. It works well with ChatGPT canvas, and I'm excited to continue experimenting. Currently, it's available only to paid subscribers, but it's expected to be available to free users upon general release. I'm looking forward to it. That's all for today. Stay tuned!

 

1) Introducing canvas, OpenAI, Oct 3 2024
2) Mastering the game of Go without human knowledge, David Silver, Julian Schrittwieser, Karen Simonyan, Ioannis Antonoglou, Aja Huang, Arthur Guez, Thomas Hubert, Lucas Baker, Matthew Lai, Adrian Bolton, Yutian Chen, Timothy Lillicrap, Fan Hui, Laurent Sifre, George van den Driessche, Thore Graepel & Demis Hassabis, Google DeepMind, Nature, Vol. 550, p. 355, Oct 19 2017
3) Monte-Carlo-Tree-Search-with-ChatGPT-canvas, Oct 6 2024

 

Notice: ToshiStats Co., Ltd. and I do not accept any responsibility or liability for loss or damage occasioned to any person or property through using materials, instructions, methods, algorithms or ideas contained herein, or acting or refraining from acting as a result of such use. ToshiStats Co., Ltd. and I expressly disclaim all implied warranties, including merchantability or fitness for any particular purpose. There will be no duty on ToshiStats Co., Ltd. and me to correct any errors or defects in the codes and the software.

OpenAI's "o1-preview" Arrives: Is This the Next Leap Towards Artificial General Intelligence?!

On September 12, 2024, OpenAI released its new generative AI model "o1" (pronounced "oh-one"), which had been the subject of much speculation. I had the opportunity to try it out, and here are my initial impressions.

 

1. Model Overview

As a new generative AI model, o1 has various features, but the key points are as follows:

  • Specialized for scientific, coding, and mathematical reasoning.

  • Available in two versions: OpenAI o1 and OpenAI o1-mini.

  • Currently in preview with limited functionality and performance.

  • Not a successor to GPT-4.

  • OpenAI o1 has a limited usage of 30 requests per week.

  • Price: OpenAI o1 is about six times more expensive than GPT-4o.

For more details, please refer to the official website (1).

Compared to GPT-4o, o1-preview demonstrates superior performance in coding, data analysis, and mathematics, as shown below. It seems likely that o1 will excel in fields where existing generative AI has struggled to achieve satisfactory accuracy. However, because it utilizes Chain of Thought reasoning to arrive at answers, it can take a considerable amount of time to respond, making it unsuitable for tasks requiring real-time answers.

GPT-4o vs. o1-preview: Task Performance Comparison

 

2. Challenging o1 with Game24

Let's test the capabilities of o1-preview. A common example of a task that generative AI struggles with is Game24.

This is a simple mathematical puzzle with the following rules:

  • Use the four given numbers and basic arithmetic operations (addition, subtraction, multiplication, division).

  • Create a mathematical expression that results in 24.

  • Each of the four given numbers can be used only once.

Example: 13, 10, 9, 4 → (10 - 4) × (13 - 9)
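
For contrast, the puzzle is trivial for a brute-force program, which is exactly why it makes a good reasoning test for an LLM. A minimal solver sketch (my own illustration, using exact rational arithmetic to avoid floating-point surprises):

```python
# A small brute-force Game24 solver for reference: repeatedly combine two
# numbers with an arithmetic operation until one value remains.
from fractions import Fraction
from itertools import combinations

def solve24(nums, target=Fraction(24)):
    """Return one expression that evaluates to `target`, or None."""
    def search(items):
        if len(items) == 1:
            return items[0][1] if items[0][0] == target else None
        for i, j in combinations(range(len(items)), 2):
            (a, ea), (b, eb) = items[i], items[j]
            rest = [items[k] for k in range(len(items)) if k not in (i, j)]
            candidates = [(a + b, f"({ea}+{eb})"), (a * b, f"({ea}*{eb})"),
                          (a - b, f"({ea}-{eb})"), (b - a, f"({eb}-{ea})")]
            if b != 0:
                candidates.append((a / b, f"({ea}/{eb})"))
            if a != 0:
                candidates.append((b / a, f"({eb}/{ea})"))
            for v, e in candidates:
                found = search(rest + [(v, e)])
                if found:
                    return found
        return None

    return search([(Fraction(n), str(n)) for n in nums])

print(solve24([13, 10, 9, 4]))  # finds a valid expression, e.g. ((10-4)*(13-9))
```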

When attempting this with o1-preview, it produced the following result. It successfully solved the puzzle! The response took about 15 seconds, likely due to internal trial-and-error processes.

Game24 instruction

o1-preview Game24 Trial Result

When trying the same with GPT-4o:

GPT-4o Game24 Trial Result

GPT-4o fails to provide a correct answer. This highlights o1's superiority in tasks that require strong logical reasoning.

 

3. The Impact on the Future of Generative AI

o1's newfound capabilities are attributed to its incorporation of Chain of Thought reasoning, enabling it to generate task-specific chains of thought and produce more reliable correct answers. However, the Chain of Thought process, which demonstrates how the correct answer is derived, is not revealed to the user. This is somewhat disappointing, as users typically want to understand not only the correct answer but also "why" that answer was reached. Therefore, it's understandable that some may perceive it as a black box. We hope that the open-source development community will further research this aspect and share their findings with the world. With excellent open-source generative AI models like Llama and Gemma currently available, we believe that user verification of Chain of Thought will become possible in the near future.

 

Conclusion

o1-preview seems to have been received with a level of excitement not seen since the release of GPT-4 in March 2023. In the next installment, I plan to explore the technology behind this impressive generative AI, based on external speculation. That's all for today. Stay tuned!

 

1) Introducing OpenAI o1, OpenAI, Sep 12, 2024 

 

Notice: ToshiStats Co., Ltd. and I do not accept any responsibility or liability for loss or damage occasioned to any person or property through using materials, instructions, methods, algorithms or ideas contained herein, or acting or refraining from acting as a result of such use. ToshiStats Co., Ltd. and I expressly disclaim all implied warranties, including merchantability or fitness for any particular purpose. There will be no duty on ToshiStats Co., Ltd. and me to correct any errors or defects in the codes and the software.

The era of "agent-style applications" has arrived earlier than expected, and it seems to be accelerating even further

On November 6, OpenAI held DevDay, its first annual developer conference. The technological developments since the debut of GPT-4 in March 2023 were introduced all at once. There's too much to cover comprehensively, so I'll leave that to OpenAI CEO Sam Altman; here I want to raise three key points I've considered and explore them further.




1. Price is Key

The anticipated price reduction has been realized. GPT-4 is roughly 65% off, though the exact reduction varies with usage. I've already tried the new GPT-4 Turbo for half a day, and it cost about $5, where it would previously have exceeded $10. This makes it more viable for Proof of Concept (PoC) use. It seems the time has come to tap GPT-4's still unseen potential in various areas. A wallet-friendly price is a welcome change for everyone.



2. Building AI Apps Without Being a Programmer

At this developer conference, I noticed many features that operate with no code. GPTs, which let you create a customized ChatGPT in a dialogue format, are a prime example. The developer-oriented Assistants API also doesn't require coding if used through the Playground. With the code interpreter tool already built in, you simply write prompts that invoke and execute it, and the rest is automated. This is impressive.

Without writing any code, I implemented a model to calculate default probabilities using a step-by-step prompt (steps 1 through 5) with the code interpreter turned on. When executed, the model was successfully created, and it performed tasks like calculating AUC and generating histograms as instructed.
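
For readers who prefer the API route, the same workflow takes only a few calls. Below is a minimal sketch (assistant name, instructions, and prompts are illustrative; my actual experiment used the no-code Playground):

```python
# A minimal sketch of the Assistants API with the code_interpreter tool.
# Names, instructions, and prompts are illustrative assumptions.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

assistant = client.beta.assistants.create(
    name="Credit modeler",
    instructions="Build a default-probability model step by step: "
                 "load the data, fit a model, report AUC, plot histograms.",
    tools=[{"type": "code_interpreter"}],
    model="gpt-4-1106-preview",
)

thread = client.beta.threads.create()
client.beta.threads.messages.create(
    thread_id=thread.id, role="user",
    content="Step 1: load the training data and describe the columns.",
)
run = client.beta.threads.runs.create(
    thread_id=thread.id, assistant_id=assistant.id,
)
```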





3. Easy Construction of "Agent-Style Applications"

Listening to OpenAI CEO Sam Altman's presentation, I felt a strong emphasis on agents. The Playground tool includes function calling, which seems to make it much easier to create agents that decide their next actions based on the situation. While open-source implementations of agents have been increasing, I didn't expect them to be implemented this quickly on the OpenAI platform. Paired with GPTs, 2024 feels like it could be the first year of "agent-style applications." This is truly exciting.

How about these new services? Following the announcements at DevDay, developers worldwide seem to be thinking about various AI applications. I'm also eager to start creating an agent-style application. Stay tuned!




Copyright © 2023 Toshifumi Kuga. All rights reserved

Notice: ToshiStats Co., Ltd. and I do not accept any responsibility or liability for loss or damage occasioned to any person or property through using materials, instructions, methods, algorithms or ideas contained herein, or acting or refraining from acting as a result of such use. ToshiStats Co., Ltd. and I expressly disclaim all implied warranties, including merchantability or fitness for any particular purpose. There will be no duty on ToshiStats Co., Ltd. and me to correct any errors or defects in the codes and the software.

GPT-4V is here. I tried it immediately and was amazed. It can do this too!

Sorry to keep you waiting. OpenAI's GPT-4 now comes with image recognition capabilities. To be precise, this was demonstrated when GPT-4 debuted in March of this year, but it has only now, half a year later, been made available to users. I recently tried the new feature in ChatGPT+ and, in a word, it's incredible!

By the way, the image mentioned above was also created with a combination of GPT-4 and DALL-E3.

Now, let's start the experiment!


First, we'll start with recognizing mobile phones. It can accurately count the number of mobile phones. This is a piece of cake.

 

I thought flight information would be challenging, but it identified the destination impeccably. Since it's originally an excellent language model, it seems proficient in deriving meaning from images.

 

It can even recognize Osaka's Tsutenkaku tower. Local information is no problem.

 

For a change, I inserted an image of analysis results. It can read graphs effortlessly. This is impressive!

 

What shocked me was that it could easily count cars. Of course, it's not a specialized object detection model, so errors will always occur. I believe there were about 48 cars in this photo, but for general use, this margin of error seems acceptable. It's astonishing what it can do by just being given an image.

 

It can count cans, but the error is relatively significant. It might struggle with cluttered items.

 

It also reads English text well, in an OCR-like manner.

 

It can also easily read the time displayed on electronic signboards.

How did you find it? Without any fine-tuning, it achieved this much. GPT-4V has just been launched, and various use cases are likely to emerge in the future. I look forward to introducing interesting examples here as they arise. Stay tuned!

 

Copyright © 2023 Toshifumi Kuga. All rights reserved

Notice: ToshiStats Co., Ltd. and I do not accept any responsibility or liability for loss or damage occasioned to any person or property through using materials, instructions, methods, algorithms or ideas contained herein, or acting or refraining from acting as a result of such use. ToshiStats Co., Ltd. and I expressly disclaim all implied warranties, including merchantability or fitness for any particular purpose. There will be no duty on ToshiStats Co., Ltd. and me to correct any errors or defects in the codes and the software.

Fine-tuning GPT-3.5 with synthetic text generated by GPT-4. The accuracy has improved! In the future, we might not even need training text???

Hello, despite being in the latter half of September, it is still quite hot in Japan. The photos feel mismatched, but I'm deliberately sticking to the autumn theme, hoping it will get cooler soon. However, it might stay hot for the rest of the month.

Now, about the fine-tuning of GPT-3.5 that I introduced the other day: it's certainly a hot topic. I think there is strong demand in companies to specialize its performance for specific tasks. For this reason, we conducted an experiment for cases where you want to proceed even without data at hand: generating synthetic text and then fine-tuning on it.

 
1. Experiment Details

Just like the previous experiment, we set a task to determine which financial product a given English-language complaint is about. These are complaints in the banking industry, so the task involves differentiating between six types of financial products, such as mortgages and bank accounts. The data used for validation was minimal at 100 samples, just like last time. However, the training data is different this time: we generated customer complaint emails with GPT-4, and at a glance they are indistinguishable from real ones. GPT-4's performance is indeed impressive. We generated 15 such customer complaints for training and then proceeded with fine-tuning.

synthetic text generated by GPT-4
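
To make the workflow concrete, here is a hedged sketch of the generation step (prompt wording, product labels, and file names are illustrative; the JSONL uses the chat format the fine-tuning endpoint expects, and the SDK calls are in the pre-v1 openai Python style of that period):

```python
# A sketch of the synthetic-data step: ask GPT-4 to write a complaint about
# a given product, then store it in chat-format JSONL for fine-tuning.
# Prompt text and labels are illustrative assumptions.
import json
import openai

PRODUCTS = ["mortgage", "bank account", "credit card",
            "credit reporting", "debt collection", "student loan"]

def synthetic_complaint(product):
    resp = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[{"role": "user",
                   "content": "Write a realistic customer complaint email "
                              f"about a {product}. Do not name the product "
                              "category explicitly."}],
    )
    return resp["choices"][0]["message"]["content"]

with open("train.jsonl", "w") as f:
    for product in PRODUCTS:
        text = synthetic_complaint(product)
        example = {"messages": [
            {"role": "system", "content": "Classify the financial product."},
            {"role": "user", "content": text},
            {"role": "assistant", "content": product},
        ]}
        f.write(json.dumps(example) + "\n")
```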


2. Experiment Results

Since this was our first time using synthetic text, we were worried about the outcome, but we were able to confirm the effectiveness of fine-tuning as follows. Though the improvement isn't dramatic with just 15 samples, the accuracy for this task has improved compared to the base GPT-3.5, which had an accuracy of 0.5 to 0.55.

For more details on the experiment, please refer to this notebook.

 

3. Discussion

Fine-tuning with synthetic text was a method not even considered before, but with the arrival of GPT-4, it's becoming more realistic. There are several points to consider, such as the number of samples and how to write prompts, but the advantage of being able to start even without data is significant. Currently, GPT-4 is the only option for generation models, but it seems like new models like Gemini from Google will also be available next year. Technology is advancing rapidly, so we can expect a lot more in the future.

So, what did you think? We will continue to conduct various experiments and share our findings here. See you again soon!




Copyright © 2023 Toshifumi Kuga. All rights reserved

Notice: ToshiStats Co., Ltd. and I do not accept any responsibility or liability for loss or damage occasioned to any person or property through using materials, instructions, methods, algorithms or ideas contained herein, or acting or refraining from acting as a result of such use. ToshiStats Co., Ltd. and I expressly disclaim all implied warranties, including merchantability or fitness for any particular purpose. There will be no duty on ToshiStats Co., Ltd. and me to correct any errors or defects in the codes and the software.

Fine-tuning has come to ChatGPT. Its effects are outstanding, and if applied appropriately to the task, we can perhaps expect significant improvements in accuracy!!

Hello everyone, how are you doing? Although the illustration is autumn-like, it seems that summer will stick around for a while in Japan.

While that was happening, I suddenly received a message from OpenAI saying, "The fine-tuning feature has been implemented." I have always fine-tuned open-source models, so I was a little disappointed that ChatGPT didn't have this feature. But it seems that it has finally made its appearance. I guess OpenAI got a little serious. Let's get started right away.

 
1. Is fine-tuning effective for ChatGPT?

I'm sure you all want to know, "Does fine-tuning work well with ChatGPT?" So I created a small dataset and conducted a simple experiment. To put it briefly, "Something amazing is happening!" Below is the table with the results.

Accuracy for 100 samples

I had GPT-3.5 perform a 6-class classification task and expected some fine-tuning effect. However, exceeding an accuracy of 0.8 was unexpected. The plain GPT-3.5 only barely surpassed 0.5, so I initially thought the model's potential was lacking. However, an accuracy of 0.88 appeared on the first fine-tuning run, which was hard to believe. Upon changing the seed and refreshing the data, it still yielded an accuracy near 0.8, completely different from the plain model's accuracy. The compatibility between fine-tuning and ChatGPT must be outstanding.

 

2. Experiment Details

In this experiment, the task was to identify what type of financial product a given English complaint is about. This is a 6-class classification across financial products such as home loans and bank accounts, and the data used for fine-tuning consisted of 100 samples each for training and validation, a minimal configuration. The training results show the training loss decreasing and appearing to approach zero (in fact it continues to fall further). Quick conclusion: it went well. Using this fine-tuned model yielded the results described in section 1.
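
The mechanics themselves are only a few API calls. A minimal sketch in the pre-v1 openai Python SDK of that period (file name illustrative):

```python
# A minimal sketch of the fine-tuning workflow (pre-v1 openai Python SDK,
# as available in August 2023; the training file name is illustrative).
import openai

# 1. Upload the chat-format JSONL training file.
train_file = openai.File.create(file=open("train.jsonl", "rb"),
                                purpose="fine-tune")

# 2. Launch a fine-tuning job on gpt-3.5-turbo.
job = openai.FineTuningJob.create(training_file=train_file["id"],
                                  model="gpt-3.5-turbo")

# 3. Once the job succeeds, call the resulting model like any other:
# openai.ChatCompletion.create(model=job["fine_tuned_model"], messages=[...])
```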

 

3. Discussion

Just by looking at the results of this experiment, we can't definitively say that fine-tuning always succeeds. Various cases will emerge in the future, and it will be important to make comprehensive judgments based on those results. Especially this time, minimal prompt engineering was done. Combining prompt engineering and fine-tuning to achieve the best performance is a future challenge. There are many points to consider, like cost and computation time. It will require trial and error. While GPT-4 indeed performs well with an accuracy around 0.8 for this task, its cost is high, and implementation isn't always straightforward. Even in such cases, the new weapon of fine-tuning has come into our hands, increasing our options and potentially moving us a step forward in problem-solving.

How was it? I would like to introduce more experiments and their results here in the future. Stay tuned!




Copyright © 2023 Toshifumi Kuga. All rights reserved



Notice: ToshiStats Co., Ltd. and I do not accept any responsibility or liability for loss or damage occasioned to any person or property through using materials, instructions, methods, algorithms or ideas contained herein, or acting or refraining from acting as a result of such use. ToshiStats Co., Ltd. and I expressly disclaim all implied warranties, including merchantability or fitness for any particular purpose. There will be no duty on ToshiStats Co., Ltd. and me to correct any errors or defects in the codes and the software.

“Function calling” is a game changer, as GPT can access the outside world and easily be turned into an agent!

Today, I want to create a website describing a collection of Japanese sweets, just like the “Dorayaki” in the picture above. So I ordered my AI agent to create an awesome website. But is it really possible? I am sure it is! As you know, OpenAI created GPT, a very intelligent large language model (LLM). On 13 June 2023, OpenAI introduced “Function calling”. It can bridge GPT to outside systems, APIs, and functions. Let me explain step by step!

 

1. What is the advantage of “Function calling”?

Function calling makes it easy for GPT to access external functions. For example, when you want to create a website where Japanese sweets are explained to customers, you need to connect GPT to a function that can write the website's code in HTML/CSS. With “Function calling”, GPT can call this function and pass it parameters such as “explanations of Japanese sweets”. The official documents say: “The latest models (gpt-3.5-turbo-0613 and gpt-4-0613) have been fine-tuned to both detect when a function should be called (depending on the input) and to respond with JSON that adheres to the function signature.”

 

2. The list of “functions” is key to setting up “function calling”

“Function calling” looks great! But how can we implement it in our code? It is quite simple: just prepare a list of functions. Each function definition should have

  • "name"

  • "description"

  • "parameters" : "type" , "properties", "required"

In ChatCompletion.create, we should add “functions=functions” because we want to call the function. The rest of the code hardly changes. The code below shows an example of a function definition, which comes from the official documents; please look at those docs for details if needed.
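
Here is that example, lightly reformatted: the classic get_current_weather definition from the June 2023 announcement, together with the ChatCompletion call that passes it in (pre-v1 openai SDK style, as used in this post):

```python
# The classic get_current_weather example from the official announcement,
# lightly reformatted (pre-v1 openai Python SDK).
import openai

functions = [
    {
        "name": "get_current_weather",
        "description": "Get the current weather in a given location",
        "parameters": {
            "type": "object",
            "properties": {
                "location": {
                    "type": "string",
                    "description": "The city and state, e.g. San Francisco, CA",
                },
                "unit": {"type": "string", "enum": ["celsius", "fahrenheit"]},
            },
            "required": ["location"],
        },
    }
]

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613",
    messages=[{"role": "user", "content": "What's the weather like in Boston?"}],
    functions=functions,
)
message = response["choices"][0]["message"]
# If the model chose to call the function, message["function_call"] holds
# the function name and a JSON string of arguments to pass along.
```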

 

3. Let's see what the generated website looks like

OK, it is time to see the result from our agent. I instructed our agent: “Create a web-site for a pretty Japanese sweets collection”. The text of the “title” and “explanation” is generated by GPT-3.5-turbo and sent to the function that creates the web page. Here is the result. Everything is written in Japanese. The title means “a pretty Japanese sweets collection”. The sentences of the explanation are pretty good! I don't think there is any need to fix or modify them at all.

If you want to know more details, you can see the code here.

https://github.com/TOSHISTATS/Wagashi-Collection-Web-Generation-agent-by-GPT3.5#readme

 

I hope this gives you a sense of how AI agents work. I think the potential use cases of “function calling” are limitless. I tried several use cases with “function calling” and found that it can be a game changer for developing LLM application systems. I would like to update my article about AI agents built on OpenAI GPT soon. Stay tuned!

 
 
 

Copyright © 2023 Toshifumi Kuga. All rights reserved

Notice: ToshiStats Co., Ltd. and I do not accept any responsibility or liability for loss or damage occasioned to any person or property through using materials, instructions, methods, algorithms or ideas contained herein, or acting or refraining from acting as a result of such use. ToshiStats Co., Ltd. and I expressly disclaim all implied warranties, including merchantability or fitness for any particular purpose. There will be no duty on ToshiStats Co., Ltd. and me to correct any errors or defects in the codes and the software.