Word Hunt/Boggle are word games where players find words in a grid of
letters by connecting adjacent tiles, without reusing any tile in the same
word. As a fan of Boggle, I became intrigued by the challenge of creating
the optimal board, one that theoretically yields the highest possible
score.
Scoring Potential
In typical gameplay, players have about one minute to find words and
accumulate points, often maxing out around 30,000 points on a good
board. However, significantly higher scores are theoretically
possible.
Efficiently calculating these scores is rather challenging. Checking all
300,000 words against the board on every iteration is far too slow.
Instead, I build a trie out of the input words and recursively traverse
it alongside a depth-first search over the board, efficiently keeping
track of visited tiles. The search is then easily parallelized by
starting from all tiles simultaneously.
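A minimal sketch of this idea in Python (simplified: the board is assumed to be a flat 4x4 list of lowercase letters, and the scoring table is standard Boggle scoring):

```python
class TrieNode:
    __slots__ = ("children", "is_word")
    def __init__(self):
        self.children = {}
        self.is_word = False

def build_trie(words):
    root = TrieNode()
    for w in words:
        node = root
        for ch in w:
            node = node.children.setdefault(ch, TrieNode())
        node.is_word = True
    return root

def neighbors(i, size=4):
    r, c = divmod(i, size)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            rr, cc = r + dr, c + dc
            if (dr or dc) and 0 <= rr < size and 0 <= cc < size:
                yield rr * size + cc

def word_score(word):
    # standard Boggle scoring by word length
    n = len(word)
    return 0 if n < 3 else {3: 1, 4: 1, 5: 2, 6: 3, 7: 5}.get(n, 11)

def score_board(board, root):
    """board: flat list of 16 lowercase letters."""
    found = set()
    def dfs(i, node, visited, prefix):
        nxt = node.children.get(board[i])
        if nxt is None:
            return  # no word continues with this letter, prune the search
        word = prefix + board[i]
        if nxt.is_word:
            found.add(word)
        visited |= 1 << i  # mark this tile visited (a bitmask keeps it cheap)
        for j in neighbors(i):
            if not visited & (1 << j):
                dfs(j, nxt, visited, word)
    for start in range(16):
        dfs(start, root, 0, "")
    return sum(word_score(w) for w in found)
```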
Optimization
To determine the optimal Boggle board, I implemented a genetic
algorithm for global search combined with hill climbing for local
optimization. My initial approach used simulated annealing, but it
often failed to escape local optima, so the genetic algorithm approach
was chosen instead. This approach efficiently discovers high-scoring
4x4 boards and can approximate optimal configurations for other board
sizes and shapes.
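In outline, the search looks something like this (a simplified sketch: the population size, mutation rate, and elitism fraction are placeholder values, and `score` would be the trie-based scorer above):

```python
import random
import string

def mutate(board, rate=0.1):
    # randomly replace some tiles with new letters
    return [random.choice(string.ascii_lowercase) if random.random() < rate else ch
            for ch in board]

def hill_climb(board, score):
    # local optimization: try every single-tile change, keep improvements
    best, best_score = board[:], score(board)
    improved = True
    while improved:
        improved = False
        for i in range(len(best)):
            for ch in string.ascii_lowercase:
                cand = best[:i] + [ch] + best[i + 1:]
                s = score(cand)
                if s > best_score:
                    best, best_score, improved = cand, s, True
    return best, best_score

def genetic_search(score, pop_size=50, generations=100):
    pop = [[random.choice(string.ascii_lowercase) for _ in range(16)]
           for _ in range(pop_size)]
    for _ in range(generations):
        scored = sorted(pop, key=score, reverse=True)
        elite = scored[:pop_size // 5]  # keep the top 20%
        pop = elite + [mutate(random.choice(elite))
                       for _ in range(pop_size - len(elite))]
    # polish the best candidate with hill climbing
    return hill_climb(max(pop, key=score), score)
```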
How It Was Made
The project was originally built from scratch in Python, with all of
the core search logic written in pure Python. Matplotlib and Imageio
were used for creating visualizations.
More recently, the code has been rewritten in C++, providing roughly a
15x speed improvement. This was done to allow searching larger boards,
with the ultimate goal of finding the smallest board that contains
every word in the Scrabble dictionary.
Airport Management System
Overview
A friend and I built a database system for the model problem of
managing an airport. This involved formally designing an appropriate
data model, planning out the tables and permissions involved,
generating realistic data for demonstrations, implementing the database
and all of the query logic, and finally wrapping it in a clean UI.
Views
The project has three different views that can be logged into:
administration, crew members, and passengers. Each view has a complete
user interface for viewing the allowed information, such as upcoming
flights, as well as for modifying it.
For example, a passenger can see info about their own upcoming flights,
as well as all flights going into and out of the airport. They can then
book, cancel, or swap flights, or modify details of a booking such as
the number of bags or their seat (if other seats are available). A crew
member, on the other hand, can view their own schedule, including which
other crew are on the flights they are working and which seats are full
on those flights.
All of the permissions are handled through SQL views to ensure no
unintended data is available.
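For instance, a passenger-facing view might be defined roughly like this (a hypothetical sketch; the project's actual table and column names differ):

```python
# Hypothetical schema: a view exposes only the columns a passenger may see,
# and the application filters rows by the logged-in passenger's id using a
# parameterized query (works with any DB-API driver).
PASSENGER_FLIGHT_VIEW = """
CREATE VIEW passenger_flights AS
SELECT b.passenger_id, f.flight_number, f.origin, f.destination,
       f.departure_time, b.seat, b.num_bags
FROM bookings b
JOIN flights f ON f.flight_id = b.flight_id;
"""

def upcoming_flights(cursor, passenger_id):
    cursor.execute(
        "SELECT * FROM passenger_flights WHERE passenger_id = %s",
        (passenger_id,),
    )
    return cursor.fetchall()
```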
Utility
Each view is fully functional with the features you would expect. The
data model stores airlines, planes, terminals, destinations, flights,
crew members, passengers, bags, and booking info. New users,
passengers, and flights can be dynamically created and edited via the
admin UI, and passengers can create and modify bookings as well as
their baggage info.
To aid usability, searching is done via dropdown menus that generate
SQL queries on the fly. This makes finding specific information, like
flights on a specific day, simple. Furthermore, searches can be
downloaded in an easily readable text format for logging and further
analysis.
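Conceptually, the dropdown selections translate into parameterized queries along these lines (a simplified sketch; `flight_search` and the column names are hypothetical):

```python
def build_search(filters):
    """Build a parameterized query from dropdown selections.

    filters: dict of column -> selected value, e.g.
             {"destination": "JFK", "departure_date": "2024-04-01"}
    """
    clauses, params = [], []
    for column, value in filters.items():
        clauses.append(f"{column} = %s")  # column names come from fixed dropdowns,
        params.append(value)              # so no free-form text reaches the SQL
    where = " AND ".join(clauses) or "TRUE"
    return f"SELECT * FROM flight_search WHERE {where}", params

# usage: cursor.execute(*build_search({"destination": "JFK"}))
```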
How It Was Made
To build the final project, we used MariaDB to handle all of the data.
For the UI, we used PySide6 to implement a simple, easy-to-navigate
design.
US State Image Geolocation
Overview
This project explores the challenge of predicting the US state of origin
for Google Maps images. Using advanced machine learning techniques, we
developed a model capable of identifying the state based solely on visual
cues from satellite imagery. This task presents unique challenges due to
the subtle differences between states and the variety of landscapes within
each state.
Model Architecture
After extensive experimentation, we found that a Vision Transformer (ViT)
outperformed traditional convolutional neural networks for this task. The
ViT's ability to capture long-range dependencies in images proved crucial
for identifying state-specific features across diverse landscapes.
We experimented with multiple training approaches, including
classification (predicting the state directly), regression (predicting
latitude and longitude), and hybrid models that combined both tasks. To
handle partial data where exact coordinates weren't available, we
implemented custom loss functions that could train effectively on mixed
data types.
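One plausible formulation of such a loss, sketched in PyTorch (the actual loss functions and weighting we used are not reproduced here): samples with known coordinates contribute to both terms, while samples without contribute only to the classification term.

```python
import torch
import torch.nn.functional as F

def hybrid_loss(state_logits, coord_preds, state_targets, coord_targets,
                alpha=0.5):
    """Classification + regression loss that tolerates missing coordinates.

    coord_targets uses NaN where exact lat/lon is unavailable.
    """
    cls_loss = F.cross_entropy(state_logits, state_targets)
    # mask out samples with missing coordinates before the regression term
    has_coords = ~torch.isnan(coord_targets).any(dim=1)
    if has_coords.any():
        reg_loss = F.mse_loss(coord_preds[has_coords], coord_targets[has_coords])
    else:
        reg_loss = torch.zeros((), device=state_logits.device)
    return cls_loss + alpha * reg_loss
```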
However, the most significant performance improvement we discovered
came from adding many augmentations. In principle, every part of the
image can contribute to the prediction, but in practice the model
tended to latch onto a single specific cue. To prevent this, we applied
augmentations including, but not limited to, changing image angles,
adding blur and color-shift effects, and blocking out significant
portions of the image. Without them, the model did not learn to use the
entire image, resulting in significantly worse overall results and
notably more overfitting.
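With FastAI, a comparable augmentation pipeline can be assembled roughly like this (a sketch; the magnitudes shown are placeholders rather than our tuned values):

```python
from fastai.vision.all import aug_transforms, RandomErasing

batch_tfms = [
    *aug_transforms(
        max_rotate=15.0,   # change image angles
        max_lighting=0.4,  # brightness/contrast (color) shifts
        max_warp=0.2,      # perspective distortion
    ),
    RandomErasing(p=0.5, max_count=3),  # block out portions of the image
]
```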
Performance and Insights
Our model achieved an impressive accuracy of over 60% in correctly
identifying the state from a single image. When allowed to provide its
top 3 predictions, the accuracy increased to over 80%. This
performance is particularly noteworthy given the challenging nature of
the task and the visual similarities between many states.
Interestingly, attempts to visualize the model's decision-making
process using techniques like GradCAM did not reveal easily
interpretable patterns. This suggests that the model learns to
consider the entire image holistically rather than focusing on
specific landmarks or features.
Dataset and Training
The model was trained on a dataset of 10,000 Google Maps images, with
samples from various locations across each state. Remarkably, we found
that the model performed well even when trained on just 500 examples
per state, demonstrating its ability to generalize from limited data.
We employed a range of data augmentation techniques to enhance the
model's robustness and prevent overfitting. These included random
rotations, flips, and color adjustments to simulate variations in
lighting and seasonal changes.
Technical Implementation
The project was built using the FastAI library, which provided a
flexible framework for implementing custom model architectures and loss
functions. We extended FastAI's capabilities by developing specialized
heads for our multi-task learning approach and implementing custom loss
functions to handle partial data.
The Vision Transformer architecture was chosen for its superior
performance in our initial experiments. We fine-tuned pre-trained ViT
models, adapting them to our specific geolocation task through careful
hyperparameter optimization and custom training routines.
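Putting it together, the fine-tuning setup looks roughly like this (a sketch: the data path is hypothetical and the hyperparameters are placeholders; FastAI resolves timm model names such as the one below):

```python
from fastai.vision.all import (ImageDataLoaders, Resize, accuracy,
                               aug_transforms, vision_learner)

# hypothetical layout: images stored under one folder per state
dls = ImageDataLoaders.from_folder(
    "data/states", valid_pct=0.2,
    item_tfms=Resize(224),        # ViT patch models expect 224x224 inputs
    batch_tfms=aug_transforms())

learn = vision_learner(dls, "vit_small_patch16_224", metrics=accuracy)
learn.fine_tune(10, base_lr=1e-3)
```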
Adversarial NLP
Overview
We investigated a state-of-the-art defense against adversarial examples
in NLP classifiers. Our research revealed an easy way to defeat the
defense mechanism, demonstrated a general problem with NLP defenses,
and concluded by working towards better generalized evaluation methods.
Background
While AI offers many potential benefits, one risk is our incomplete
understanding of how it works. This has led to the discovery that small
modifications to inputs can "trick" classifiers. Initially studied with
images, this issue has extended to text classification and response
systems used for tasks like spam classification and sentiment analysis.
For text classification, these "attacks" broadly work by changing words
in an input until they find a change that seems to confuse the model,
then repeating this until the model is confused enough to make the
wrong prediction. Some examples of this can be found near the bottom of
this page.
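As a rough illustration, a greedy word-substitution attack can be sketched as follows (the victim model `predict_proba` and the `synonyms` generator are stubs, and real attacks add semantic-similarity constraints; raising `target_conf` turns this into the high-confidence attack discussed later):

```python
def word_swap_attack(words, predict_proba, synonyms, target_conf=0.5):
    """Greedy word-substitution attack (binary sentiment, bare-bones sketch).

    predict_proba(words) -> model probability of the ORIGINAL (true) label
    synonyms(word)       -> candidate replacement words (stub here)
    target_conf          -> confidence required in the WRONG label; raising
                            it above 0.5 yields a "high confidence" attack
    """
    words = list(words)
    for i in range(len(words)):
        best_p, best_cand = predict_proba(words), None
        for cand in synonyms(words[i]):
            trial = words[:i] + [cand] + words[i + 1:]
            p = predict_proba(trial)
            if p < best_p:  # this swap most lowers confidence in the true label
                best_p, best_cand = p, cand
        if best_cand is not None:
            words[i] = best_cand
        if best_p <= 1 - target_conf:
            return words  # model now confidently predicts the wrong label
    return None  # attack failed within the substitution budget
```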
Defense Mechanism Analysis
The published defense aimed to determine whether an example was real
or adversarial (i.e., crafted to trick the model) by analyzing the
distribution of the classifier's logits. Upon further investigation,
our research demonstrated that in practice it mostly functioned as a
binary classifier on model confidence, gaining less than 5% performance
over a simple confidence threshold.
Indeed, plotting the distribution of confidence for adversarial versus
clean predictions shows that the defense's output depends almost
entirely on the confidence level.
Defeating the Defense
Using our understanding, we demonstrated how to trick the defense
99.4% of the time without altering how a human would interpret the
sentence, simply by swapping words. Our methods could trick the
defense into misclassifying both real and adversarial examples.
Adversarial Defense Confidence Dependence

Attack Confidence Threshold | Defense Success Rate (%)
0.5   | 93.0
0.8   | 73.6
0.97  | 0.05
0.997 | 0.00
Example of NLP attacks + Defense
These examples demonstrate how the attack and defense work. Given an
initial sample, words are changed to affect the model's prediction.
Here, we take the same sentence and change it in different ways. For
each scenario we report the sentiment classification of the original
model, as well as the defense's prediction of whether the example was
real or an adversarial example (AE).
When the original model has low confidence, the defense assumes the
example is adversarial, and when the confidence is high, it assumes the
example is valid. This holds regardless of whether the example was
actually made adversarial.
Original Example
"I went there today! The cut was [[terrible]]! I have an awful
experience. They lady that cut my hair was [[nice]] but she wanted to
leave early so she made a disaster in my head!"
Model prediction: negative (confidence 100). Defense prediction: Not Adversarial.

Low Confidence AE
"I went there today! The cut was [[monstrous]]! I have an awful
experience. They lady that cut my hair was [[fantastic]] but she wanted
to leave early so she made a disaster in my head!"
Model prediction: positive (confidence 61). Defense prediction: Adversarial.

High Confidence AE
"I went there today! The cut was [[monstrous]]! I have an awful
experience. They lady that cut my hair was [[marvelous]] but she wanted
to leave early so she made a disaster in my head!"
Model prediction: positive (confidence 90). Defense prediction: Not Adversarial.

Close Boundary non-AE
"I went there today! The cut was [[terrible]]! I have an awful
experience. They lady that cut my hair was [[fantastic]] but she wanted
to leave early so she made a disaster in my head!"
Model prediction: negative (confidence 60). Defense prediction: Adversarial.
Implications for Future Research
As it turns out, many different adversarial example defenses end up
functioning the same way: they are complex models that effectively
learn to classify based on confidence. This is problematic because a
knowledgeable attacker can simply require their attacked sample to
reach a confidence of 90% instead of 50% and immediately defeat the
defense. In our testing, we even showed this is often not that much
harder!
To address this, we recommended that future defense research use two
different types of attacks. First, adaptive attacks that aim to
simultaneously fool the model and keep the defense's prediction from
rising. Second, high-confidence attacks: normal attacks allowed to run
until they reach a high confidence.
Hair Image Curvature Analysis
Overview
As part of my research with Dr. Tina Lasisi, I worked on updating the
Fibermorph project, a tool for determining the length and curvature of
human hairs from microscopic images. There were three main goals:
improve the ease of use, improve the accuracy, and improve the speed.
Addressing these required coming up with new algorithmic approaches, as
well as figuring out methods for generating realistic testing data.
This page discusses some, but not all, of what went into the project.
Algorithmic Problem
The overall problem is to take a noisy microscope image containing some
approximately circular arcs, identify the arcs, and calculate a best
fit for each. There are a few challenges with this specific task: the
arcs often overlap, they can occasionally have a bend (which should be
treated as two separate arcs), they are blurry, and there can be small
gaps.
Because the problem was challenging, a few approaches were evaluated.
The first idea was to use a variant of the Hough transform modified for
circular arcs, much like is done in this paper. However, this struggled
when the arcs turned out to be more elliptical in practice. The second
method was to skeletonize each line, then enumerate all feasible
endpoint combinations and attempt classical curve fitting. This
approach requires many more heuristics, but proved better in the worst
cases.
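A stripped-down sketch of the second approach (skeletonize a binarized image, then do an algebraic least-squares circle fit per connected component; the endpoint-pairing and bend-splitting heuristics are omitted):

```python
import numpy as np
from skimage.measure import label
from skimage.morphology import skeletonize

def fit_circle(xs, ys):
    """Algebraic (Kasa) least-squares circle fit; returns center and radius."""
    A = np.column_stack([xs, ys, np.ones_like(xs)])
    b = xs**2 + ys**2
    (a, c, d), *_ = np.linalg.lstsq(A, b, rcond=None)
    cx, cy = a / 2, c / 2
    r = np.sqrt(d + cx**2 + cy**2)
    return (cx, cy), r

def arc_curvatures(binary_image):
    """Curvature (1/radius) of each skeletonized arc in a binary image."""
    skeleton = skeletonize(binary_image)
    labeled = label(skeleton)
    curvatures = []
    for k in range(1, labeled.max() + 1):
        ys, xs = np.nonzero(labeled == k)
        if len(xs) < 10:  # skip tiny fragments of noise
            continue
        _, r = fit_circle(xs.astype(float), ys.astype(float))
        curvatures.append(1.0 / r)
    return curvatures
```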
Algorithm Evaluation
Part of the challenge with this project was the difficulty of
evaluating new approaches. In addition to improving the UI for
checking analyses, a priority was placed on generating realistic data
with known "correct" results. For this, a UI was built that supports
generating custom images with different levels of noise, blur, color
variance, scratches, and types of arcs. For proper testing, we need to
control whether arcs have "bends", how circular or elliptical they
are, the clarity and thickness of the arcs, and the level of overlap.
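The heart of such a generator is small: rasterize parametric arcs with known ground truth, then layer blur and noise on top (a compact sketch; scratches, bends, and color variance are left out):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def synthetic_arc_image(size=512, n_arcs=5, blur=1.5, noise=0.05, rng=None):
    """Render random elliptical arcs with known parameters, plus blur/noise."""
    rng = rng or np.random.default_rng()
    img = np.zeros((size, size))
    truth = []
    for _ in range(n_arcs):
        cx, cy = rng.uniform(0.2, 0.8, 2) * size
        a = rng.uniform(0.1, 0.3) * size  # semi-major axis
        b = a * rng.uniform(0.7, 1.0)     # controls how elliptical the arc is
        t0 = rng.uniform(0, 2 * np.pi)
        t = np.linspace(t0, t0 + rng.uniform(1.0, 4.0), 2000)  # an arc, not a loop
        xs = (cx + a * np.cos(t)).astype(int)
        ys = (cy + b * np.sin(t)).astype(int)
        ok = (xs >= 0) & (xs < size) & (ys >= 0) & (ys < size)
        img[ys[ok], xs[ok]] = 1.0
        truth.append({"center": (cx, cy), "axes": (a, b)})
    img = gaussian_filter(img, blur)        # simulate optical blur
    img += rng.normal(0, noise, img.shape)  # simulate sensor noise
    return np.clip(img, 0, 1), truth
```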
How It Was Made
The project was built in Python, with NumPy to support efficient
vectorized operations. Some operations from scikit-image were also used
to improve image quality, and Tkinter was used to create all of the UI
components.
Kattis Contest Replay
Overview
Kattis is a popular platform for competitive programming, but it lacks a
feature to replay past contests. This project addresses that gap by
creating a website that combines multiple Kattis contests, allowing users
to compare contests or practice alongside a replayed scoreboard as if they
were part of the original event.
The Challenge
While Kattis excels in hosting competitive programming contests, the
inability to replay past events poses a significant limitation for
practice and analysis. This feature is particularly crucial for
schools and competitive programmers looking to improve their skills by
simulating real contest environments.
Our school faced this challenge multiple times last year, highlighting
the need for a solution that would allow students to experience the
dynamic nature of a live scoreboard while practicing with past contest
problems.
The Solution
To address this need, I developed a SvelteKit app that serves as a
comprehensive contest replay platform. The solution consists of two
main components:
Backend: A system that makes requests to the appropriate
Kattis contest webpages, scrapes the incoming data, and formats it into
a JSON structure for easy processing (sketched after this list).
Frontend: A user interface built with the help of Shadcn-Svelte,
providing an intuitive and visually appealing way to view contest standings
and navigate through the replay.
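To illustrate the backend's scraping step, here is a Python sketch of the idea only: the URL pattern and table structure are hypothetical stand-ins for Kattis's actual markup, and the production code lives in the SvelteKit app.

```python
import json
import requests
from bs4 import BeautifulSoup

def scrape_standings(contest_id):
    # hypothetical URL pattern and table layout
    url = f"https://open.kattis.com/contests/{contest_id}/standings"
    soup = BeautifulSoup(requests.get(url).text, "html.parser")
    rows = []
    for tr in soup.select("table tr")[1:]:  # skip the header row
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:
            rows.append({"rank": cells[0], "team": cells[1],
                         "results": cells[2:]})
    return json.dumps(rows)
```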
Key Features
Ability to combine and compare multiple Kattis contests
Real-time scoreboard evolution simulation
User-friendly interface for easy navigation and analysis
Seamless integration with existing Kattis contest data
Maximum Leaf Spanning Trees
Overview
The Maximum Leaf Spanning Tree problem challenges us to find a subset of
edges in a given graph that forms a spanning tree with the maximum number
of leaf nodes. As an NP-hard problem, it requires innovative approaches
beyond traditional algorithms. Our project, developed as part of Penn
State's Graduate Algorithm course, introduces novel heuristics and
transformations that outperform existing methods in both effectiveness and
optimality proofs.
The Challenge
With approximately 40 students tasked with implementing the best
approach possible, common solutions relied on optimizing existing
approximation algorithms with worst-case guarantees. However, these
methods often fell short in practical applications, performing
relatively poorly compared to optimal solutions.
Our challenge was twofold: develop a more effective heuristic approach
and find a way to prove optimality for as many instances as possible,
something no other team had achieved.
Our Solution
We developed two key approaches to tackle this problem:
Novel Heuristic Approach: We created a heuristic that
repeatedly chooses edges based on their "centrality", quantified by the
distance to all remaining untouched vertices (a sketch appears after
this list). Extensive testing showed this method significantly
outperformed existing approximation approaches.
Partial Reduction to 3-SAT: We developed a novel (partial)
reduction of the problem to a version of the 3-SAT problem. This allowed
us to employ state-of-the-art 3-SAT solvers, enabling us to find optimal
solutions and provide proofs of optimality for about 90% of the problem
instances.
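A sketch of the heuristic, under one plausible reading of "centrality" as the total BFS distance to the untouched vertices (the real implementation's tie-breaking and performance optimizations are omitted):

```python
from collections import deque

def bfs_distances(adj, src):
    dist = {src: 0}
    queue = deque([src])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def centrality(adj, u, untouched):
    # total distance from u to the untouched vertices (lower = more central)
    dist = bfs_distances(adj, u)
    return sum(dist.get(v, len(adj)) for v in untouched)

def mlst_heuristic(adj):
    """adj: dict mapping each vertex to its neighbor list (connected graph).
    Repeatedly expand the touched vertex most central to the untouched ones,
    attaching all of its untouched neighbors at once as leaf candidates."""
    vertices = set(adj)
    start = min(vertices, key=lambda w: centrality(adj, w, vertices))
    touched, tree_edges = {start}, []
    untouched = vertices - touched
    while untouched:
        frontier = [u for u in touched if set(adj[u]) & untouched]
        u = min(frontier, key=lambda w: centrality(adj, w, untouched))
        for v in set(adj[u]) & untouched:
            tree_edges.append((u, v))
            touched.add(v)
        untouched = vertices - touched
    return tree_edges
```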
Generating Challenging Test Cases
As a final part of the project, we developed a method to generate hard
test cases:
Generate random trees
Strategically add edges to "obfuscate" the original tree without
increasing the potential maximum leaf spanning tree
Test difficulty using our heuristics and knowledge of the optimal
solution
This method resulted in test cases demonstrating the largest average gap
from optimal among all submitted cases, effectively challenging other
teams' solutions.
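A skeleton of the generator (the "strategic" filtering step, which rejects added edges that would increase the optimal leaf count and which we verified with the solver, is omitted from this sketch):

```python
import heapq
import random

def prufer_to_tree(seq):
    """Decode a Prüfer sequence into the edge list of a random labeled tree."""
    n = len(seq) + 2
    degree = [1] * n
    for x in seq:
        degree[x] += 1
    leaves = [i for i in range(n) if degree[i] == 1]
    heapq.heapify(leaves)
    edges = []
    for x in seq:
        edges.append((heapq.heappop(leaves), x))
        degree[x] -= 1
        if degree[x] == 1:
            heapq.heappush(leaves, x)
    edges.append((heapq.heappop(leaves), heapq.heappop(leaves)))
    return edges

def hard_instance(n, extra_edges, rng=random):
    # step 1: a uniform random tree on n vertices
    tree = prufer_to_tree([rng.randrange(n) for _ in range(n - 2)])
    edges = {frozenset(e) for e in tree}
    # step 2: add candidate obfuscating edges (the acceptance check is omitted)
    while len(edges) < n - 1 + extra_edges:
        u, v = rng.sample(range(n), 2)
        edges.add(frozenset((u, v)))
    return [tuple(e) for e in edges]
```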
Results and Impact
The combination of our heuristic approach and 3-SAT reduction method
yielded impressive results:
Solved all but 3 of over 120 test cases at least as well as any other
team
Our heuristic approach alone outperformed the next best method
Provided proofs of optimality for ~90% of instances, a unique
achievement in the class
Physics Calculator
Overview
This project implements an advanced expression tree parser in pure
Python, featuring a sophisticated pattern matching system. It is
capable of performing complex mathematical operations including
derivatives, algebraic simplifications, and variable substitutions. As
a practical application, the system can handle first-year
calculus-based physics equations: solving for unknown variables,
combining and solving systems of equations, and converting between
different units.
The Challenge
Traditional calculator projects in introductory Data Structures
courses often lack the ability to perform algebraic manipulations or
handle complex transformations. As a teaching assistant for such a
course, I aimed to create a more challenging and less publicly
accessible project that could integrate concepts from concurrent
Calculus courses.
While the initial goal was to incorporate derivative calculations, the
project's complexity exceeded the scope of the introductory course.
This led to the expansion of the project into a more comprehensive and
flexible system.
The Solution
The core of this project is a generic pattern matching system that
allows for versatile manipulation of mathematical expressions. Key
features include:
Expression Tree parsing for arbitrary complex mathematical
operations
Pattern matching system for flexible equation manipulation
Ability to perform derivatives, algebraic simplifications, and
variable substitutions
Application to physics equations, including solving for unknowns and
unit conversions
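A toy illustration of the pattern matching core (not the project's actual rule set): wildcard nodes bind subexpressions, and a matched rule rewrites the tree via substitution.

```python
class Node:
    def __init__(self, op, *children):
        self.op, self.children = op, children
    def __repr__(self):
        if not self.children:
            return self.op
        return f"({self.op} {' '.join(map(repr, self.children))})"

def match(pattern, expr, bindings=None):
    """Match expr against pattern; names starting with '?' are wildcards."""
    bindings = {} if bindings is None else bindings
    if pattern.op.startswith("?"):  # wildcard: bind it, or check consistency
        seen = bindings.get(pattern.op)
        if seen is None:
            bindings[pattern.op] = expr
            return bindings
        return bindings if repr(seen) == repr(expr) else None
    if pattern.op != expr.op or len(pattern.children) != len(expr.children):
        return None
    for p, e in zip(pattern.children, expr.children):
        if match(p, e, bindings) is None:
            return None
    return bindings

def substitute(template, bindings):
    if template.op in bindings:
        return bindings[template.op]
    return Node(template.op, *(substitute(c, bindings) for c in template.children))

# Rewrite rule: ?x * 1  ->  ?x
lhs = Node("*", Node("?x"), Node("1"))
rhs = Node("?x")

expr = Node("*", Node("+", Node("a"), Node("b")), Node("1"))  # (a + b) * 1
b = match(lhs, expr)
if b is not None:
    print(substitute(rhs, b))  # prints: (+ a b)
```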
Implementation and Future Potential
The approach used in this project is similar to how abstract syntax
trees are parsed in many programming languages. It also shares
fundamental principles with advanced Computer Algebra Systems (CAS)
like Wolfram Alpha or SageMath. The main difference lies in the clever
heuristics for expression tree manipulation that full-fledged CAS
systems implement, which are crucial for efficient equation handling.
While the current implementation relies on brute-force methods for
manipulating equations, which can be slow for complex expressions, it
provides a solid foundation for further development. With the addition
of more sophisticated heuristics, this system could potentially evolve
into a powerful tool for mathematical and physical computations.
Other Projects
Overview
Beyond the featured projects, I've worked on a variety of other
interesting and challenging projects across different domains. While these
projects may not have dedicated pages due to various reasons (including
having a hard drive crash on me, similarity to other projects, or having
created them for university course assignments), they represent a diverse
range of my skills and interests in computer science and artificial
intelligence.
Games AI
Genetic algorithm for evolving 2048 heuristics used in an expectimax
search, achieving a 95% win rate and able to provide feedback to human
players
Alpha-beta search with Tkinter visualization for Reversi
Simulation analysis of many different games, most notably for Dominion
Research-Oriented AI
In-depth research presentations on AI security papers
Multiple different adversarial example attacks on MNIST images
Implementation of Neural Cellular Automata with interactive UI
CNNs in NumPy with model visualization
Genetic Algorithms for optimizing clustering algorithms by applying
transformations to each dimension separately, applied to sports
predictions
Web Scraping and Data Analysis
Parsing 30+ years of somewhat unstandardized 10-K annual reports for
financial research
YouTube API data extraction and channel prediction via community
analysis
Standardizing Quiz Bowl questions from various PDF formats into JSON
files
Facebook Messenger log analysis for communication patterns and
sentiment
Video Analysis
Automatic sports highlight extraction based on crowd audio levels
Zoom Office Hour content optimization by removing downtime
Other Projects I Found Fun
Python AST modification to make quick testing easier, specifically
improving debugging and changing floating point operations to Fraction
calculations at runtime
ASCII art image and video converter, including a full ASCII art video
player
Simple assembly-like language interpreter
Brainf*** JIT compiler and code generator with different heuristics
and string algorithms for short code generation