Python’s lpsolver for Fantasy Soccer

by Ainara

On Thursday, February 20, 2014, Bill Mill gave a great talk at Microsoft’s NERD Lab. There was a collection of talks that evening focused on showcasing cool things one can do with Python. Fortuitously, Bill’s project included optimization as well!

Bill, an outsider to soccer, wanted to choose a fantasy soccer team.  Players of varying abilities cost a certain amount and Bill had a budget.  So how should he choose a team that would maximize his chances of winning?


You could pick them manually but you’d have to really know the players and maybe there are too many decisions to handle efficiently.  Also, it would be a lot of work.

Luckily Bill is an ace at programming in Python and decided to “see if I could get my computer to pick a better team for me than the one I had picked myself.”

He found the /web/api endpoint in the Premier League’s HTML code. This allowed him to download the data for each player as a JSON file. To manage the data, Bill used a trendy new interface called IPython Notebook. He wrote code to iterate through the hundreds of players, counting the number of points scored at each position. He was also able to add up the points that teams allow; there was a lot of variation in this (Liverpool only gave up 2 fantasy points, whereas Fulham gave up 4). He realized a player’s opponent would be important in modeling his future points.
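The counting step might look something like the sketch below. The field names (`position`, `opponent`, `points`) and the sample records are hypothetical stand-ins made up for illustration; the real JSON schema is not shown in this post, so treat this as the general idea rather than Bill’s code.

```python
from collections import defaultdict

# Hypothetical, hand-made records: one entry per player appearance.
# The real JSON from the site will have a different schema.
appearances = [
    {"name": "A", "position": "f", "opponent": "LIV", "points": 2},
    {"name": "B", "position": "m", "opponent": "FUL", "points": 6},
    {"name": "C", "position": "f", "opponent": "FUL", "points": 9},
    {"name": "D", "position": "d", "opponent": "LIV", "points": 1},
]

def points_by(records, key):
    """Sum fantasy points over records, grouped by the given field."""
    totals = defaultdict(int)
    for rec in records:
        totals[rec[key]] += rec["points"]
    return dict(totals)

# Points scored at each position, and points each opponent gave up.
points_by_position = points_by(appearances, "position")
points_allowed = points_by(appearances, "opponent")
```

Grouping the same records by two different keys is what makes it possible to compare positions on one hand and opponents on the other.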

Bill’s final model predicted a player’s expected points based on (1) whether the opponent was easy to score on, (2) whether he was playing at home, and (3) his past average score.
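To make the three ingredients concrete, here is one way they could be combined. The multiplicative form, the `opponent_factor` argument, and the `home_bonus` value are all assumptions of mine for illustration; Bill’s actual formula is in his slides.

```python
def expected_points(avg_points, opponent_factor, at_home, home_bonus=1.1):
    """Scale a player's past average by opponent weakness and home advantage.

    opponent_factor > 1 means the opponent concedes more points than average.
    home_bonus is a made-up multiplier, applied only for home games.
    """
    estimate = avg_points * opponent_factor
    if at_home:
        estimate *= home_bonus
    return estimate
```

A player averaging 5 points, at home against a team that concedes 20% more than average, would be called as `expected_points(5.0, 1.2, True)`.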

With this metric for each player, Bill could now write a simple integer program (IP):

maximize expected team value

s.t. total player cost ≤ 100

2 goalkeepers

5 defenders

5 midfielders

3 forwards
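Written out a bit more formally, the model reads roughly as follows. The notation is mine, not Bill’s: x_p is a 0/1 pick for player p, v_p his expected points, c_p his cost, and P_gk, P_d, P_m, P_f the sets of goalkeepers, defenders, midfielders, and forwards. I have also assumed that spending exactly the budget is allowed.

```latex
\begin{aligned}
\max_{x} \quad & \sum_{p} v_p \, x_p \\
\text{s.t.} \quad & \sum_{p} c_p \, x_p \le 100, \\
& \sum_{p \in P_{gk}} x_p = 2, \quad
  \sum_{p \in P_{d}} x_p = 5, \quad
  \sum_{p \in P_{m}} x_p = 5, \quad
  \sum_{p \in P_{f}} x_p = 3, \\
& x_p \in \{0, 1\} \quad \text{for every player } p.
\end{aligned}
```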

The solver’s output for each player is 1 if he was chosen and 0 if not. I would highly recommend looking at Bill’s presentation slides. They are impeccably made and you can see the Python code for yourself. For example, the objective function was written like this:

def objective_function():
    # Join one "<expected points> <variable name>" term per player with " + ".
    m = " + ".join("{ev} {p.pos}{p.idn}".format(p=p, ev=p.expected_points())
                   for p in player_objs)
    return "max: " + m + ";\n"
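The constraints can be generated as text in the same spirit. The sketch below is my own guess at how that might look, assuming the model is handed to the solver in lp_solve’s LP text format (which the `max: ...;` syntax suggests). The `Player` tuple, the `constraints` function, and the row names are hypothetical; only `pos` and `idn` mirror names from the snippet above.

```python
from collections import namedtuple

# Hypothetical stand-in for the talk's player objects: position code,
# id number, and cost. Only pos/idn mirror names used in the objective.
Player = namedtuple("Player", "pos idn cost")

SQUAD_SIZES = {"gk": 2, "d": 5, "m": 5, "f": 3}  # from the model above

def var(p):
    """Variable name for a player, matching the objective's {p.pos}{p.idn}."""
    return "{0}{1}".format(p.pos, p.idn)

def constraints(players, budget=100):
    """Return budget, position-count, and 0/1 constraints as LP-format text."""
    lines = []
    cost = " + ".join("{0} {1}".format(p.cost, var(p)) for p in players)
    lines.append("budget: " + cost + " <= {0};".format(budget))
    for pos, count in SQUAD_SIZES.items():
        names = [var(p) for p in players if p.pos == pos]
        if names:
            lines.append("n_{0}: {1} = {2};".format(pos, " + ".join(names), count))
    # Upper bound of 1 plus an integer declaration makes each pick 0 or 1.
    lines.extend("{0} <= 1;".format(var(p)) for p in players)
    lines.append("int " + ", ".join(var(p) for p in players) + ";")
    return "\n".join(lines) + "\n"
```

Concatenating the objective string with this text would give one complete model to pass to the solver.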

And the output Bill got,


[#2 ARS Wojciech Szczesny £60 gk,

#8 ARS Per Mertesacker £66 d,

#46 AVL Leandro Bacuna £44 m,

#63 CAR Pete Whittingham £53 m,

#69 CAR Jordan Mutch £46 m,

#82 CHE John Terry £67 d,

#130 EVE Seamus Coleman £66 d,

#214 LIV Luis Suárez £134 f,

#232 MCI Gnegneri Yaya Touré £101 m,

#297 NOR John Ruddy £49 gk,

#326 SOU Jose Fonte £52 d,

#328 SOU Luke Shaw £49 d,

#333 SOU Adam Lallana £77 m,

#342 SOU Rickie Lambert £70 f,

#343 SOU Jay Rodriguez £64 f]


In the end, Bill’s IP was not game-changing, but he did finish in the top third. There are several aspects of Bill’s project that resonated with me. It was cool that he used Python. I’ve only used Python for web scraping, but I’ll be interested in giving lpsolver a go now.

I also liked Bill’s approach in using a simple IP to help support a decision in Fantasy Soccer, an unexpected and quotidian use of optimization. It shows that models don’t have to be overly complicated in order to be useful. I think simple optimization models such as these should be more widespread in daily life.

One of my favorite optimization papers attempts to optimize diet. In this April 2013 paper, Professor Dimitris Bertsimas and Dr. Allison O’Hair at MIT designed an IP to help diabetic patients construct a meal and exercise plan. Their objective function is

$$\text{minimize} \quad \lambda\,(BG - 140) - \sum_{i}\sum_{t} p_{it}\, y_{it}$$

where BG is the blood glucose level and lambda is a parameter to weight the emphasis placed on blood glucose.  The second part of the objective maximizes the patient’s preferences p for the particular foods y consumed.

They have many constraints, but those are mainly to ensure serving size, calories, blood glucose, and exercise are at the right levels.  Again, a very simple model that can be very effective.  I am currently taking a class at MIT taught by both Allison and Professor Bertsimas.  I’m hoping to build off their model for my final project and make it more applicable to non-diabetics like me!  (e.g., What if I want to maximize nutrition instead?)

As for Bill and his soccer optimization, Adam and I will both be expanding on his model for our AM221 project. We are going to apply it to Fantasy Football. Instead of maximizing the expected score, we seek to maximize the probability of scoring more than the opponent. Since we are dealing with probabilities now, we will be optimizing the CDF of the difference of two Normals, which is a sigmoid-shaped function. So stay tuned for how we wrestle with that!
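For the curious, here is a small sketch of that win probability. It assumes the two team scores are independent Normals, which is a simplification on my part, and the function names are just illustrative.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def win_probability(mu_us, sigma_us, mu_them, sigma_them):
    """P(our score > their score) for independent Normal scores."""
    # The difference of two independent Normals is Normal with
    # mean mu_us - mu_them and variance sigma_us**2 + sigma_them**2.
    mean_diff = mu_us - mu_them
    sd_diff = math.sqrt(sigma_us ** 2 + sigma_them ** 2)
    return normal_cdf(mean_diff / sd_diff)
```

Two evenly matched teams come out at exactly one half, and the probability climbs along an S-shaped curve as our expected score pulls ahead, which is why this objective is harder to optimize than a plain linear one.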


Bill’s awesome slides and pictures, check them out at

Bertsimas, Dimitris and Allison O’Hair.  “Personalized and Adaptive Diabetes and Diet Management: A Robust Optimization Approach.”  April 2013.

