‘Any sufficiently advanced technology is indistinguishable from magic.’ Arthur C. Clarke
In the SBC Neural Network post we saw a network with 1k weights trained on 10k samples to approximate the sine function. In this post we will use a model with 175G weights trained on 450G samples that can program better than the average programmer. The size of these models is impressive, but nobody really knows how they work or what their limitations are.
GitHub Copilot is an AI tool that speeds up software development, allowing the programmer to do many things that were previously impossible. At first it seems similar to using StackOverflow, a website where programmers post questions when they don’t know how to do something, but Copilot goes much further: it is able to synthesize a new answer for our problem.
Copilot is available as an extension for Microsoft Visual Studio Code and continuously suggests code in grey that you can accept by pressing the Tab key. This suggested code can be roughly described as the “most common” match between your query (your code) and the training dataset (GitHub code).
In this example, we define our function and its docstring and ask Copilot for a completion. As we see, the completion matches the docstring. The first intuition is that Copilot acts as a search engine and simply matches your query against its training dataset (150GB of open source projects), but this is not how it works.
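To illustrate the pattern, a docstring-driven completion might look like the sketch below (the function and docstring here are illustrative stand-ins, not the post’s actual example):

```python
def fahrenheit_to_celsius(f):
    """Convert a temperature from degrees Fahrenheit to Celsius."""
    # A Copilot-style completion: the body follows the docstring
    return (f - 32) * 5 / 9
```

The programmer only types the signature and the docstring; the body in grey is what Copilot proposes.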
Here we create a random/crazy identifier that cannot be in the training set. The result still looks like the most coherent solution that could be provided, in this case: the sum of the input parameters.
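The flavour of this experiment can be sketched as follows (the identifier below is made up for illustration):

```python
def xqzwv_blorp_zzz(a, b, c):
    # The name cannot plausibly appear in any training data, yet the
    # most coherent completion is still to combine the parameters:
    return a + b + c
```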
In this example, we ask (in Spanish) for the area of intersection of two circles given their centers and radii. Copilot understands the Spanish text without problems and suggests the function name, the parameters and the whole function body. After a short review, it looks like the code should work.
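A working version of what such a suggestion computes is sketched below (the function and parameter names are illustrative, not the exact Copilot output):

```python
import math

def circle_intersection_area(x1, y1, r1, x2, y2, r2):
    """Area of the intersection of two circles given centers and radii."""
    d = math.hypot(x2 - x1, y2 - y1)
    if d >= r1 + r2:           # circles do not overlap
        return 0.0
    if d <= abs(r1 - r2):      # one circle fully inside the other
        return math.pi * min(r1, r2) ** 2
    # lens area: two circular segments minus the kite-shaped overlap
    a1 = r1 ** 2 * math.acos((d ** 2 + r1 ** 2 - r2 ** 2) / (2 * d * r1))
    a2 = r2 ** 2 * math.acos((d ** 2 + r2 ** 2 - r1 ** 2) / (2 * d * r2))
    k = 0.5 * math.sqrt((-d + r1 + r2) * (d + r1 - r2)
                        * (d - r1 + r2) * (d + r1 + r2))
    return a1 + a2 - k
```

Edge cases (disjoint and contained circles) are exactly the kind of detail the “short review” of a Copilot suggestion should check.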
Now we create a hypothetical question/answer text. This makes Copilot match the query against exam-like material that may be in its training dataset. We simply ask for the capital of Spain, and Copilot generates the correct answer.
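The prompt pattern is roughly the following (the exact wording is a reconstruction, not the post’s literal text):

```python
# A question/answer text for Copilot to continue:
prompt = """Q: What is the capital of Spain?
A:"""
# Copilot completes the text with the most likely answer: "Madrid"
```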
However, if we ask about a nonexistent country, Copilot still gives its best answer, which looks “correct” too.
In this example we reverse the process: we give the answer and try to generate the question. Copilot generates a question that we did not expect. We expected ‘What is the capital of France?’ but Copilot asked ‘What is the result of the following code?’; still, it can be understood as a correct suggestion.
Here we force Copilot to ask about what we want by switching to a more common language and providing the first letter of the question. However, it generates yet another question, this time completely wrong, with no relation to the answer.
In summary, Copilot:
- Builds a suggestion based on the most common solution.
- Is usually correct if your query makes sense.
- Is usually wrong when your query looks like a common problem but is not, and actually has a very different objective.
Copilot using open source libraries
Copilot was trained on open source projects, which include millions of use cases of open source libraries like numpy, opencv, qt… This makes Copilot really useful, because it helps the programmer with the most common suggestion, which is usually the best one.
In this example, we use the unittest Python module, and Copilot knows that unittest.TestCase has a method named assertEqual and also knows that foo(1, 2) should be 3.
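Such a test might look like this sketch (foo and the class name are assumptions for illustration, not the post’s exact code):

```python
import unittest

def foo(a, b):
    """Return the sum of a and b."""
    return a + b

class TestFoo(unittest.TestCase):
    def test_foo(self):
        # Copilot can complete both the method name (assertEqual)
        # and the expected value (3) from context alone
        self.assertEqual(foo(1, 2), 3)
```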
Above we create a more complex foo function (that we assume cannot be in the training data) to see if Copilot really understands the code. After running the code with 17 test cases, only 6 failed, giving a 65% success ratio.
It may not seem like much, but keep in mind that Copilot is not a Python interpreter: it has not executed the function to get its output. Copilot has used what it learned during training to convert our query into output that has perfect Python syntax and also works well 65% of the time.
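To make the distinction concrete, here is a hypothetical stand-in for a “more complex foo” (not the post’s actual function) together with the kind of guesses Copilot makes:

```python
def foo(a, b):
    """Return a * b if a is greater than b, otherwise a + b."""
    return a * b if a > b else a + b

# Completing assertEqual lines forces Copilot to *predict* outputs,
# not execute them; a plausible right guess and a plausible wrong one:
right_guess = (foo(3, 2) == 6)   # correct prediction
wrong_guess = (foo(2, 3) == 6)   # incorrect: foo(2, 3) is actually 5
```

A 65% hit rate on predictions like these is impressive precisely because no interpreter is involved.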
One might expect that a long input text would cause Copilot to fail, but it doesn’t: the more information we give, the better the answers Copilot can generate.
In the example above we ask for a complex task, a complete program that requires several kinds of understanding to solve: Python programming skills, knowledge of MicroPython-specific libraries and how to use them correctly, and even comprehension of a plain-text description.
The full suggestion is displayed in the next cell. Note that it matches the program description very well. The application class makes sense, and even the MicroPython libraries (Pin, UART, ADC, Pin.irq…) are correctly used. It is not 100% perfect; in this case, for example, temp_sensor is an ADC object, which has no init() method, and other small errors can appear, but the overall structure of the program is definitely correct and the small errors can be fixed easily.
Finally, in the example below we use Copilot to add comments to the previous code. We copy the class twice and give Copilot a hint like “Docstring version of the above class”. Copilot regenerates the class with comments for each line.