By Skander, 29 November, 2024

Stochastic Multi-armed Bandit - Thompson Sampling With Beta Distribution

We have previously explored two multi-armed bandit (MAB) strategies: Maximum Average Reward (MAR) and Upper Confidence Bound (UCB). Both approaches rely on the observed average reward to determine which arm to pull next, using a deterministic scoring mechanism for decision-making.
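By contrast, Thompson sampling picks an arm at random, guided by a posterior distribution over each arm's reward probability. The sketch below is a minimal illustration and not the article's implementation. It assumes Bernoulli (0/1) rewards and a uniform Beta(1, 1) prior. The function name `thompson_sampling` and its parameters are hypothetical.

```python
import random


def thompson_sampling(true_probs, rounds, seed=0):
    """Run Thompson sampling on simulated Bernoulli arms.

    true_probs: assumed success probability of each arm (simulation only).
    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    successes = [0] * n_arms  # number of 1-rewards observed per arm
    failures = [0] * n_arms   # number of 0-rewards observed per arm
    pulls = [0] * n_arms

    for _ in range(rounds):
        # Draw one sample per arm from its Beta(successes + 1, failures + 1) posterior.
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Pull the arm whose sampled value is the largest.
        arm = max(range(n_arms), key=lambda a: samples[a])

        # Simulate a Bernoulli reward and update the chosen arm's counts.
        reward = 1 if rng.random() < true_probs[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        pulls[arm] += 1

    return pulls


if __name__ == "__main__":
    print(thompson_sampling([0.1, 0.5, 0.9], rounds=2000))
```

Because the choice depends on random draws from the posteriors, an arm with few observations still has a chance of being selected, which is where the exploration comes from.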

By Skander, 15 November, 2024

The Exploration-Exploitation Balance: The Epsilon-Greedy Approach in Multi-Armed Bandits

In this article, I will explore the balance between exploration and exploitation, a key concept in reinforcement learning and optimization problems. To illustrate this, I will use the multi-armed bandit problem as an example. I will also explain how the epsilon-greedy strategy effectively manages this balance.
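As a rough illustration of the idea, here is a minimal sketch of an epsilon-greedy selector on simulated Bernoulli arms. It is not the code from the article. The function name `epsilon_greedy`, its parameters, and the choice to treat unpulled arms as having an average reward of 0 are assumptions made for this example.

```python
import random


def epsilon_greedy(true_probs, rounds, epsilon=0.1, seed=0):
    """Run epsilon-greedy on simulated Bernoulli arms.

    true_probs: assumed success probability of each arm (simulation only).
    Returns the number of times each arm was pulled.
    """
    rng = random.Random(seed)
    n_arms = len(true_probs)
    counts = [0] * n_arms    # number of pulls per arm
    totals = [0.0] * n_arms  # sum of rewards per arm

    def average(a):
        # Observed average reward; an unpulled arm counts as 0.0 (assumption).
        return totals[a] / counts[a] if counts[a] else 0.0

    for _ in range(rounds):
        if rng.random() < epsilon:
            # Explore: pick an arm uniformly at random.
            arm = rng.randrange(n_arms)
        else:
            # Exploit: pick the arm with the best observed average (ties go to the lowest index).
            arm = max(range(n_arms), key=average)

        reward = 1.0 if rng.random() < true_probs[arm] else 0.0
        counts[arm] += 1
        totals[arm] += reward

    return counts


if __name__ == "__main__":
    print(epsilon_greedy([0.1, 0.5, 0.9], rounds=5000, epsilon=0.1))
```

With `epsilon=0.0` this sketch never explores and keeps pulling the first arm, which shows why some exploration is needed before the observed averages can be trusted.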

By Skander, 12 November, 2024

Comparison of Three Multi-armed Bandit Strategies

In a previous article, I introduced the design and implementation of a multi-armed bandit (MAB) framework. This framework was built to simplify the implementation of new MAB strategies and provide a structured approach for their analysis. Three strategies have already been integrated into the framework: RandomSelector, MaxAverageRewardSelector, and UpperConfidenceBoundSelector. The goal of this article is to compare these three strategies.
By Skander, 8 November, 2024

Design and Implementation of A Unifying Framework For Multi-armed Bandit Solvers

In previous blog posts, we explored the multi-armed bandit (MAB) problem and discussed the Upper Confidence Bound (UCB) algorithm as one approach to solving it. Research literature has introduced multiple algorithms for tackling this problem, and there is always room for experimenting with new ideas. To facilitate the implementation and comparison of different algorithms, we introduce a framework for MAB solvers.

By Skander, 3 November, 2024

Analyzing the Upper Confidence Bound Algorithm

This article evaluates the implementation of the Upper Confidence Bound (UCB) algorithm discussed in a previous article. The evaluation is conducted using a single dataset provided by Super Data Science.

Number of impressions for each ad over time.
By Skander, 1 November, 2024

A Python Implementation of The Upper Confidence Bound Reinforcement Learning Algorithm

This article explores the implementation of a reinforcement learning algorithm called the Upper Confidence Bound (UCB) algorithm. Reinforcement learning, a subfield of artificial intelligence, involves an agent interacting with an environment over a series of episodes or rounds. In each round, the agent makes a decision that may yield a reward. The agent's ultimate objective is to learn a strategy that maximizes its cumulative reward over time.

By Skander, 14 August, 2024

Clean and Reusable Property Validation Using TypeScript Decorators

In this blog post, we’ll explore how to implement two custom object property validators using TypeScript decorators. While popular libraries like class-validator already provide a rich set of decorator-based validators, our goal here is to demonstrate how to build your own in a clean and reusable way, specifically a @Positive validator and a @NotEmpty validator.

By Skander, 15 June, 2013

Inferno By Dan Brown

I am an avid reader of Dan Brown's books. I loved reading "Angels & Demons". His book "The Da Vinci Code" motivated me to learn more about the three major monotheistic religions from a historical point of view.

I was eagerly awaiting the publication of "The Lost Symbol". The book came out after a three-year delay, and it was a great disappointment. I had the impression that Dan Brown was writing his book for Hollywood and not for his readers. I said to myself: Brown is dead as an author, and he will never dare to publish a book again. A passing fad.

Novels
Skander Kort