Unfolding the universe of possibilities..

Whispers from the digital wind, hang tight..

Designing Operations Research Solutions: A User-friendly Routing Application with Streamlit

30.09.2023

16 minutes

55 Views

From mathematical models to software engineering in Python

Photo by Caspar Camille Rubin on Unsplash

Bridging the gap between theory and applications is essential in Operations Research and Data Science. Whereas theoretical foundations form the core of optimization solutions as they provide the means to solving complex problems, we should also be concerned with how to make these concepts accessible and actionable for practical use.

The traveling salesman problem (TSP) is undoubtedly the most extensively studied problem in combinatorial optimization (Rego et al., 2011). It is easy to describe (verbally at least) and can be used to show some possible components of modern routing APIs. Therefore, I just could not think of a better alternative to use in this story.

In this tutorial, you’ll learn how to use the Python library Streamlit to build a web application to solve the TSP based on user-provided input data. As we are concerned with real-world applications, the solution goes beyond just Euclidean distances. It should be able to extract actual road driving distances between points using their coordinates and incorporate these distances into the optimization process. To do so the OpenStreetMap API will be used.

If you are interested in understanding better theoretical aspects of numerical optimization, you might want to check my stories on Linear Programming and the Vehicle Routing Problem (which is a generalization of the TSP).

Are you ready for some hands-on? Have a glimpse at our final results…

Screen capture of the final application. (Animation by the author).

As the entire code might be too long to include in this story, part was just referenced but omitted. You can check the complete code in this repository though.

The Traveling Salesman Problem

The TSP consists of finding the shortest tour connecting N points. As the number of points considered increases, due to the combinatorial nature of the problem, finding an optimal or near-optimal solution can become very challenging.

Good quality solution for TSP with 1000 locations obtained using custom heuristics. (Image by the author).

From exact methods to heuristics and meta-heuristics, the techniques to solve the TSP must consider pairwise distances between elements to obtain a solution. Although, in this article, I will not dive into details of solution methods, consider a generic function to solve the TSP has the following signature:

solve_tsp(distances: np.ndarray) -> List[int]

You can find how to implement some solution strategies in the original code repository. Here, however, we will focus on how something like solve_tsp can be seen as a component in software development.

Repository folder structure

Throughout this tutorial, a simple flat folder structure will be adopted with the following elements:

.streamlit: This folder is where general Streamlit settings will be placed.data: The folder that stores datasets used throughout development and that might be regular examples.assets: This folder stores elements referenced in the main app script, such as images, logos, etc.optimizer: An internal Python package used in the app.app.py: The main Streamlit script.Dockerfile: The instructions to build a Docker image for the application.requirements.txt: Python dependencies to run the application.README.md: Project description..dockerignore and .gitignore: As their names suggest, files ignored by Docker build and by Git respectively.tsp-app/
│
├── .streamlit/ # General streamlit configuration
│ └── config.toml
│
├── data/ # Directory for data if any
│ ├── dataset.csv
│ └── …
│
├── assets/ # These will be referenced in the app
│ ├── dataset.csv
│ └── …
│
├── optimizer/ # TSP optimizer internal package
│ ├── __init__.py
│ └── …
│
├── app.py # Main Streamlit script
│
├── Dockerfile # Docker configuration file
├── .dockerignore
│
├── .gitignore # File ignored in Git
├── requirements.txt # Python dependencies
└── README.md # Project description

More complex applications might create a separate package for utility functions of the Streamlit interface itself.

…
├── optimizer/ # TSP optimizer internal package
│ ├── __init__.py
│ └── …
│
├── interface/ # Utility functions for the interface
│ ├── __init__.py
│ └── …
│
├── app.py # Main Streamlit script
…

Just be careful and ensure that you define paths in your project based on their relative position to the directory from where you run the script or launch the application.

First, make sure you have all your dependencies installed in your Python environment.

pip install -r requirements.txt

In the case of the TSP app, we might face some dependency conflicts, so I suggest you install ortools separately with the –no-deps option.

pip install –no-deps ortools==9.7.2996

Using the flat structure proposed, you can run the application from the root of the directory by running the following command line:

streamlit run app.py

Or you can build a docker image using the Dockerfile provided in my code repository using the following command.

docker build -t tsp_app:latest .

You can replace tsp_app by any repository name you would like and latest by any version.

Then you can execute your app by running

docker run -p 8501:8501 tsp_app:latest

Let us understand better how to use the app.py file.

General structure

A general structure of the app.py script can be summarized by the following pseudocode:

Imports

Definition of utility functions (could have been in a separate file)
Declaring constants
Declaring default session_state attributes
Parsing solver parameters
Parsing input file

if input is not None:
read (and cache) input data

if execute button is pressed:
solve model based on input
store solution as a session_state attribute

if solution is not None:
make output available
plot results

Although it might look simple, some points of this pseudocode are crucial:

When pressing a button, Streamlit runs the code conditioned to it and resets its state. So follow Streamlit guidelines and avoid keeping these elements inside the conditions of a button:

Displayed items that should persist as the user continues.Other widgets which cause the script to rerun when used.Processes that neither modify session state nor write to a file/database.*

And that’s why we will explore as much as possible session_state attributes. After finishing the optimization, we want to keep exploring the map even if we press the download button or change some configuration without pressing the “execute” button. Therefore, displaying the solution is conditional only to its existence in session_state, not to the “execute” button.

Furthermore, some operations might be expensive, and it is useful to cache their results to avoid wasting time with re-runs. Remember: Streamlit runs your script from top to bottom at every user interaction or code change.

Basic configuration

As previously described, we can define general Streamlit settings in the config.toml file inside the .streamlit folder. In the TSP example, I used it to define colors and font. This is kind of a light purple layout, but you can try different colors. Just make sure to define them in Hex color codes.

[theme]
primaryColor = “#5300A5”
backgroundColor = “#403E43”
secondaryBackgroundColor = “#1C1B1E”
textColor = “#F5F3F7”
font = “sans serif”
base = “dark”

Let us start filling the content of app.py with the Python imports. The package os will be used to manage file paths in our operating system; BytesIO will be used to emulate an output downloadable file kept in memory instead of disk; json will be used to serialize our output solution; List is just a typehint.

To work with dataframes, pandas will be used; the solver Highs is imported from pyomo to solve our problem (in case of a MIP); streamlit is the base of the interface; and streamlit_folium will be used to insert a folium Map into the app. Additional custom functions defined in the internal package are also imported.

import os
from io import BytesIO
import json
from typing import List

import pandas as pd
from pyomo.contrib.appsi.solvers.highs import Highs
import streamlit as st
from streamlit_folium import st_folium

from optimize.tsp import get_distance_xy, build_mip, solve_mip, TSP, plot_tour, request_matrix,
plot_map, get_coord_path

The session_state attributes will store the tour of a current solution, the pandas dataframe of a given input file, the value of that solution, and the route coordinate path (obtained from OpenStreetMap).

# Create current solution as session_state
if “tour” not in st.session_state:
st.session_state.tour = None

if “dataframe” not in st.session_state:
st.session_state.dataframe = None

if “sol” not in st.session_state:
st.session_state.sol = None

if “route_path” not in st.session_state:
st.session_state.route_path = None

Some of the utility functions used are:

driving_distances: To obtain the distance matrix from OpenStreetMap provided a dataframe (important to cache results here).upload_callback: Resets session_state attributes when a new input file is loaded.update_path: Given a tour, obtain map coordinates of the correspoinding driving path to plot results.# Finds distance matrix
@st.cache
def driving_distances(dataframe: pd.DataFrame):
return request_matrix(dataframe)[“distances”] / 1000

# Callback uploading a new file
def upload_callback():
st.session_state.tour = None
st.session_state.sol = None
st.session_state.route_path = None

# Update route path after solution
def update_path(tour: List[int], dataframe: pd.DataFrame):
coord_rt = dataframe.loc[tour, :]
path = get_coord_path(coord_rt)
st.session_state.route_path = path

And then let us configure the page layout. In the following script, we include an icon and a title for the web page; then we include the same icon on the sidebar and write an introduction in markdown style. I suggest you run streamlit run app.py and check the results so far.

# Path to icon
icon_path = os.path.join(“assets”, “icon_tsp.png”)

# Set the page config to wide mode
st.set_page_config(
page_title=”TSP”,
page_icon=icon_path,
layout=”wide”,
)

st.sidebar.image(icon_path)

st.title(“TSP”)
st.write(“Welcome to the Traveling Salesman Problem solver.”)

One of the utility functions I defined reads content from the README.md file and display this content if the user selects this option. As I want to keep the conditional to display the content independent of a possible re-run, I used a selectbox instead of a button to do so.

display_tutorial = st.checkbox(“Display tutorial”)
if display_tutorial:
section = st.selectbox(“Choose a section”, [“Execution”, “Solutions”, “Contact”], index=1)
tutorial = read_section(section)
st.markdown(tutorial)

And then let us define solver parameters and upload an input file…

Input data and solver parameters

In this example I included two options for input type:

‘xy’: Uses euclidean distances, and input data must have columns ‘x’ and ‘y’.‘lat-long’: Uses road driving distances, and input data must have columns ‘lat’ and ‘long’.

I also included two solver options:

‘MIP’: Uses models created with pyomo and solved with HiGHS.‘Heuristic’: Uses Google OR-Tools routing algorithms based on constructive + local search heuristics with multi-start.

These parameters as well as a solver time limit will be placed on the app sidebar by the following code:

problem_type = st.sidebar.selectbox(“Choose an input type:”, [“xy”, “lat-long”], index=0)
method = st.sidebar.selectbox(“Choose a strategy:”, [“MIP”, “Heuristic”], index=0)
time_limit = st.sidebar.number_input(“Time limit”, min_value=0, value=5, step=1)

To upload the file, the file_uploader function will be used. Notice that our function to reset session_state attributes is called on change of input.

file = st.file_uploader(“Upload input file”, type=[“csv”], on_change=upload_callback)

In case the input file is not empty (None), we should read it and prepare the distance matrix to be ready for the optimization…

if file is not None:
dataframe = pd.read_csv(file)
st.session_state.dataframe = dataframe
distances = FORMATS[problem_type](dataframe)
start_button = st.button(“Optimize”)

Notice that previously FORMATS had been defined as a dictionary with functions that return a distance matrix given a dataframe and a problem type. These functions were defined in the internal module and imported into the app.

FORMATS = {
“xy”: get_distance_xy,
“lat-long”: driving_distances
}

Execution

Now the execution relies on pressing the “Execute” button. This is the case of a once-per-click process. Based on this condition, the optimizer will be executed, and the results will be stored in session_state attributes.

# Run if start is pressed
if file is not None and start_button:

# Solve MIP
if method == “MIP”:
solver = Highs()
solver.highs_options = {“time_limit”: time_limit}
model = build_mip(distances)
tour = solve_mip(model, solver)
st.session_state.tour = tour

# Solve Heuristic
elif method == “Heuristic”:
model = TSP(distances)
tour = model.solve(time_limit=time_limit)
st.session_state.tour = tour

# Display the results
sol = model.obj()
st.session_state.sol = sol

# Update path in case of lat-long
if problem_type == “lat-long”:
update_path(tour, dataframe)

You can include additional widgets during these function calls. For instance, a spinner or a progress bar might look well during execution.

And then, outside the execution conditional to the button, we can make our output available depending only on its own existence.

# Compute current state variables
sol = st.session_state.sol
tour = st.session_state.tour
dataframe = st.session_state.dataframe

if sol is not None and tour is not None:
col_left, col_right = st.columns(2) # Divide space into two columns
col_left.write(f”Current solution: {sol:.3f}”)
col_right.download_button(
label=”Download Output”,
data=json.dumps(tour),
file_name=’output.json’,
mime=’json’,
)

In this simple example, we are making available a json file with the sequence of points visited in the tour. If we were supposed to write a more complex output, such as an Excel file, BytesIO could have been an interesting alternative. Suppose you want to create a download button that makes available a dictionary of pandas dataframes. You could use something like:

buffer = BytesIO()
with pd.ExcelWriter(buffer) as writer:
for name, df in dataframes.items():
df.to_excel(writer, sheet_name=name)

st.download_button(
label=”Download Output”,
data=buffer,
file_name=’output.xlsx’,
mime=’xlsx’,
)

Detailed routes with OpenStreetMap

Earlier in the app, we used the function request_matrix defined in the internal package to obtain the distance matrix of the problem using OpenStreetMap API and the Python library requests. In this function, behind the scenes, we are doing a request like:

f”https://router.project-osrm.org/table/v1/driving/{points}?sources={sources_str}&destinations={dest_str}&annotations={annotations}”

In which:

points: A string joining pairs of longitude and latitude separating coordinates of the same pair using “,” and different pairs using “;”.sources: A list of integers corresponding to sources (from) nodes.destinations: Analogous to sources, but with destinations (to) nodes.annotations: ‘distance’, ‘duration’, or ‘duration,distance’.

Now we want to obtain the detailed coordinates of the solution tour to visualize results. To do so, we will use a different request. Consider two functions that receive as an input a dataframe of elements from the tour already ordered. From the outputs retrieved by the API, we will create a list of tuples latitude, longitude and it will be passed to folium to create the map.

def get_request_points(coordinates: pd.DataFrame) -> List[str]:
return coordinates.apply(lambda x: f”{x[‘long’]},{x[‘lat’]}”, axis=1).to_list()

def get_coord_path(coordinates: pd.DataFrame) -> List[Tuple[float, float]]:
pts = get_request_points(coordinates)
pts_req = “;”.join(pts)
r = session.get(
f”http://router.project-osrm.org/route/v1/car/{pts_req}?overview=false&steps=true”
)
result = r.json()
first_route = result[“routes”][0]
coords = [(p[“location”][1], p[“location”][0])
for l in first_route[“legs”]
for s in l[“steps”]
for p in s[“intersections”]]
return coords

Plotting results with Folium

Now consider we have the list of tuples previously obtained by our function get_coord_path. We now must use it to create our folium map.

def plot_map(
path: List[Tuple[float, float]],
color: Union[str, tuple] = “darkblue”,
zoom_start=8,
**kwargs
):
# Coordinates from path
lat, long = zip(*path)

# Create map
m = folium.Map(zoom_start=zoom_start)
new_line = folium.PolyLine(
path, weight=4, opacity=1.0,
color=color, tooltip=f”Route”,
**kwargs
)
new_line.add_to(m)

# Trim zoom
sw = [min(lat), min(long)]
ne = [max(lat), max(long)]

m.fit_bounds([sw, ne])
return m

And finally, we can combine this map with the streamlit_folium library to create an amazing visualization of results.

if tour is not None and dataframe is not None:

# Plot solution
if problem_type == “xy”:
pass # Not really in the app, but skipping here

elif problem_type == “lat-long”:
map = plot_map(st.session_state.route_path)
st_folium(map, width=700, height=500, returned_objects=[])

Using the input file available in my code repository you can find and visualize the shortest driving route passing in all US continental state capitals.

Shortest driving tour between all continental US state capitals. (Image by the author).

Deploy

So far, our solution runs smoothly on local execution. However, it is probable that you would like to share your applications with a wider audience. This is where deployment becomes crucial. You can access the deployed TSP app HERE.

For this project, I chose to deploy the app using Google Cloud Run, as my idea was to deploy a containerized application (using Docker) and I was familiar with the enviroment. A concise yet useful tutorial to do similarly is available in this other story. Streamlit also offers a quick guide on this topic “How to deploy Streamlit using Docker”.

For the TSP app, I addressed the dependency conflict between ortools and streamlit within the Dockerfile. Other deployment methods lean heavily on the requirements.txt file, which might make it harder for this specific case. However, it is worth checking these “3 Easy Ways to Deploy your Streamlit Web App Online”.

I wish you all the best on your journey deploying your own Operations Research solutions!

Conclusions

In this story, the convergence of optimization and software development was explored using the simple yet effective web development framework Streamlit and the open-source routing API OpenStreetMap. By combining the core concepts of operations research with the ease of use of a web application, it is possible to create amazing tools to guide decision-making processes.

Reference

Rego, C., Gamboa, D., Glover, F., & Osterman, C., 2011. Traveling salesman problem heuristics: Leading methods, implementations and latest advances. European Journal of Operational Research, 211(3), 427–441.

Designing Operations Research Solutions: A User-friendly Routing Application with Streamlit was originally published in Towards Data Science on Medium, where people are continuing the conversation by highlighting and responding to this story.

M	T	W	T	F	S	S
				1	2	3
4	5	6	7	8	9	10
11	12	13	14	15	16	17
18	19	20	21	22	23	24
25	26	27	28	29	30

Unfolding the universe of possibilities..

Designing Operations Research Solutions: A User-friendly Routing Application with Streamlit

From mathematical models to software engineering in Python

The Traveling Salesman Problem

Repository folder structure

General structure

Basic configuration

Input data and solver parameters

Execution

Detailed routes with OpenStreetMap

Plotting results with Folium

Deploy

Further reading

Conclusions

Reference

Leave a Comment Cancel reply

Свежие записи

Свежие комментарии

Recent Posts

Deep Dive on Accumulated Local Effect Plots (ALEs) with Python

Turning your relational database into a graph database

Yes, you still need old-school NLP skills in “the age of ChatGPT”

The Two Documents Every Data Scientist Must Write Before Taking Interviews

Categories

Recent Comments

Архивы

Рубрики