AI Projects

As far as AI is concerned, I'm mostly interested in its creative applications. For example: game-playing AI, metaheuristic optimization, using pre-trained models for innovative purposes, etc. I haven't explored much of mainstream AI/ML.

Chess AI

I used the minimax algorithm to implement a chess AI in Python, with PyGame for graphics. Minimax requires an evaluation function, which takes the state of the board and outputs a score indicating which player is ahead and by how much. I implemented a simple evaluation function that considers only material and positional advantages. Minimax assumes that each player will try to maximize their own evaluation, and picks the best move by looking ahead a fixed number of moves. I incorporated multiprocessing after the first move to speed up the search. Currently the AI can look only 3-4 moves ahead, although this depth could be increased with optimizations such as alpha-beta pruning. You can find the repository here.
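
The look-ahead idea can be sketched in a few lines. This is only an illustration, not the code from the repository: to stay self-contained it plays a toy take-away game (remove 1-3 stones, whoever takes the last stone wins) instead of chess, and the names `moves`, `apply_move` and `evaluate` are hypothetical stand-ins for a chess move generator and evaluation function. It also includes the alpha-beta cut-off mentioned above as a possible optimization.

```python
import math

# Toy game (an assumption for illustration): a pile of stones, each player
# removes 1-3 stones per turn, and whoever takes the last stone wins.

def moves(pile):
    """Legal moves: how many stones may be removed from the pile."""
    return [n for n in (1, 2, 3) if n <= pile]

def apply_move(pile, move):
    """Return the pile after removing `move` stones."""
    return pile - move

def evaluate(pile, maximizing):
    """Static evaluation from the maximizing player's point of view.

    An empty pile means the player who just moved took the last stone and won.
    Non-terminal positions score 0; a chess engine would return material
    plus positional scores here.
    """
    if pile == 0:
        return -1 if maximizing else 1
    return 0

def minimax(pile, depth, maximizing, alpha=-math.inf, beta=math.inf):
    """Return (score, best_move), looking at most `depth` plies ahead."""
    if depth == 0 or pile == 0:
        return evaluate(pile, maximizing), None
    best_move = None
    if maximizing:
        best = -math.inf
        for move in moves(pile):
            score, _ = minimax(apply_move(pile, move), depth - 1, False, alpha, beta)
            if score > best:
                best, best_move = score, move
            alpha = max(alpha, best)
            if alpha >= beta:  # the minimizer would never allow this branch
                break
    else:
        best = math.inf
        for move in moves(pile):
            score, _ = minimax(apply_move(pile, move), depth - 1, True, alpha, beta)
            if score < best:
                best, best_move = score, move
            beta = min(beta, best)
            if alpha >= beta:  # the maximizer already has a better option
                break
    return best, best_move
```

Calling `minimax(5, 10, True)` searches the whole toy game and returns a score together with the move that achieves it. In chess the search cannot reach the end of the game, so the depth limit and the quality of the evaluation function decide how strong the play is.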

Genetic Algorithm

What if we could use the principles of evolution to solve optimization problems? A genetic algorithm is a metaheuristic optimization method that uses mutation and recombination to maximize the fitness of a population of candidate solutions (genes) for a given problem. I considered a very abstract problem: guiding a group of blind entities from point A to point B while avoiding collisions with walls and obstacles. You can access the simulation here.
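
A minimal sketch of the mutation/recombination loop, under simplifying assumptions: each genome is a fixed-length list of unit steps, fitness is the negative Manhattan distance to the target after walking, and walls and obstacles are left out for brevity. The constants, the truncation selection and the two-genome elitism are illustrative choices, not necessarily what the simulation uses.

```python
import random

STEPS = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # moves a blind walker can make
START, TARGET = (0, 0), (8, 4)              # hypothetical points A and B
GENOME_LENGTH = 20

def fitness(genome):
    """Higher is better: negative Manhattan distance to TARGET after walking."""
    x, y = START
    for dx, dy in genome:
        x, y = x + dx, y + dy
    return -(abs(TARGET[0] - x) + abs(TARGET[1] - y))

def crossover(a, b, rng):
    """Single-point recombination of two parent genomes."""
    cut = rng.randrange(1, len(a))
    return a[:cut] + b[cut:]

def mutate(genome, rng, rate=0.05):
    """Replace each gene with a random step with probability `rate`."""
    return [rng.choice(STEPS) if rng.random() < rate else gene for gene in genome]

def evolve(generations=60, pop_size=40, seed=0):
    """Return the best genome found and the best fitness seen in each generation."""
    rng = random.Random(seed)
    population = [[rng.choice(STEPS) for _ in range(GENOME_LENGTH)]
                  for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)
        history.append(fitness(population[0]))
        parents = population[:pop_size // 2]  # truncation selection
        children = [
            mutate(crossover(rng.choice(parents), rng.choice(parents), rng), rng)
            for _ in range(pop_size - 2)
        ]
        population = parents[:2] + children   # elitism: keep the best two unchanged
    return max(population, key=fitness), history
```

Because the two fittest genomes are carried over unchanged, the best fitness in `history` never decreases from one generation to the next. Adding obstacles would only change `fitness`, for example by stopping the walk and applying a penalty on collision.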


Image Segmentation and Optical Character Recognition

I implemented an entire pipeline which takes in an image and infers all the text present in it. This process is composed of several steps:

    Image preprocessing: A color or grayscale image needs to be converted into a binary image. This is achieved via adaptive thresholding. Any skew/rotation in the image is corrected by maximizing the intra-class variance of the row histogram.
    Image segmentation: All paragraphs in the image need to be broken down into lines, all lines into words, and all words into letters. This is again achieved by histogramming. This approach cannot deal with cursive letters yet. The main aim was to read printed text, which was easily achieved with high accuracy.
    Character preprocessing: This just involves resizing the character into a square frame with adequate margin space, followed by skeletonization if necessary.
    Character recognition: I trained four models: a DNN and a CNN, each on handwritten text and on printed text. The character is fed into the trained neural network, and the probabilities for all classes are recorded. The character with the highest probability is initially chosen.
    Post-processing: This involves spell-correction, replacing digits with letters where digits don't make sense, etc. Integrating an NLP model here would be useful. Post-processing also involves computing a metric called Levenshtein distance, which is later used to calculate the accuracy of the entire pipeline.
As expected, the CNN model trained on printed characters had the highest accuracy. You can find the repository here.
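
For reference, the Levenshtein distance between two strings is the minimum number of single-character insertions, deletions and substitutions needed to turn one into the other. Below is a small sketch of the standard dynamic-programming computation. The `accuracy` helper is an assumption about how the distance could be turned into a score (one minus the distance divided by the longer length); the pipeline may normalise differently.

```python
def levenshtein(a, b):
    """Minimum number of single-character edits that turn string a into string b."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, char_a in enumerate(a, start=1):
        current = [i]                   # turning a[:i] into "" takes i deletions
        for j, char_b in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + (char_a != char_b)  # substitute, free if equal
            ))
        previous = current
    return previous[-1]

def accuracy(predicted, truth):
    """Hypothetical normalisation: 1.0 for a perfect match, 0.0 for nothing in common."""
    longest = max(len(predicted), len(truth))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(predicted, truth) / longest
```

For example, an OCR output of "he1lo" against the ground truth "hello" differs by one substitution, so the distance is 1 and this accuracy measure gives 0.8.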

Game Dev + Computer Vision

I initially made a simplified, procedural obstacle-avoidance game called Infinity Run. I used a method called ray-casting to figure out how far away the walls are at each angle, and render the height of the walls accordingly. At this point, I realised that I could integrate a pre-trained hand-detection model, and thereby allow the player to control the character (go left/go right) through the webcam! For this purpose, I used a model from MediaPipe (Google). The framerate is a bit low because inference takes time and my laptop isn't top-notch. You can find the repository here. You can watch a video of me playing the game here.
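
To give an idea of the ray-casting step, here is a rough sketch rather than the game's actual code. It assumes a small grid map where '#' marks a wall, marches each ray forward in small steps until it enters a wall cell, and draws closer walls taller. The map, the field of view, the step size and the cosine correction against fisheye distortion are all illustrative assumptions.

```python
import math

# Hypothetical map: '#' is a wall, '.' is free space. The border is all wall,
# so every ray cast from inside is guaranteed to hit something.
GRID = [
    "#######",
    "#.....#",
    "#.....#",
    "#.....#",
    "#######",
]

def cast_ray(x, y, angle, step=0.01, max_distance=20.0):
    """March along the ray from (x, y) until it enters a wall cell; return the distance."""
    dx, dy = math.cos(angle), math.sin(angle)
    distance = 0.0
    while distance < max_distance:
        distance += step
        col = int(x + dx * distance)
        row = int(y + dy * distance)
        if GRID[row][col] == "#":
            return distance
    return max_distance

def wall_heights(x, y, heading, fov=math.pi / 3, columns=9, screen_height=200):
    """One wall height per screen column: the closer the wall, the taller it is drawn."""
    heights = []
    for i in range(columns):
        # Spread the rays evenly across the field of view, centred on the heading.
        angle = heading - fov / 2 + fov * i / (columns - 1)
        distance = cast_ray(x, y, angle)
        # Use the distance along the viewing direction to reduce fisheye distortion.
        corrected = distance * math.cos(angle - heading)
        heights.append(min(screen_height, screen_height / corrected))
    return heights
```

`wall_heights(3.5, 2.5, 0.0)` returns nine heights for a player standing in the middle of the room and facing right. A real renderer would cast one ray per screen column and draw a vertical strip of that height.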