What shapes crime data?
Crime totals reflect more than incidents. They are shaped by reporting, policing patterns, and structural factors that algorithms may ignore.
Policing density
More policing often leads to more recorded incidents.
Population density
Busier areas naturally produce more reports.
Reporting bias
Communities report incidents to police at different rates, so recorded counts reflect reporting behavior as well as what actually happens.
Interactive map
Hover over a district to preview incident totals, or click to explore how those numbers can turn into broader assumptions about place, risk, and community identity. This view shows totals rather than per capita rates.
Incident totals
Colors show relative totals from the cleaned 2015–2022 dataset.
District boundaries come from the Boston Police Districts GeoJSON. This map shows total reported incidents by district, not population-adjusted rates. A per capita view could change how districts compare by accounting for differences in population size and activity density.
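As a rough sketch of what a per capita view involves, the snippet below divides each district's total by a population figure and reports incidents per 1,000 residents. The district totals are the figures used elsewhere on this page, but the population numbers are placeholders for illustration only, not real census counts.

```python
# Minimal sketch of a per capita view, assuming district totals from the
# cleaned 2015-2022 dataset and *hypothetical* district population figures.
incident_totals = {"B2": 99_635, "C11": 86_111, "D4": 84_927, "A7": 27_895, "E5": 29_050}

# Placeholder populations -- illustrative only, not real census counts.
district_population = {"B2": 80_000, "C11": 120_000, "D4": 55_000, "A7": 35_000, "E5": 60_000}

# Incidents per 1,000 residents over the full 2015-2022 period.
rates = {
    district: total / district_population[district] * 1_000
    for district, total in incident_totals.items()
}

# Sorting by rate instead of raw total can reorder how districts compare.
for district, rate in sorted(rates.items(), key=lambda item: item[1], reverse=True):
    print(f"{district}: {rate:.0f} incidents per 1,000 residents (2015-2022)")
```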
Explore Boston Districts
Click a district below to see an example of how a system might interpret the data and why that interpretation can be misleading.
District totals are based on combined Boston crime data (2015–2022) processed in Python.
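As a rough illustration of that processing step, here is a minimal Python sketch of how yearly crime files could be combined and summarized into district totals. The folder, file pattern, and DISTRICT column name are assumptions for illustration, not the project's exact pipeline.

```python
# Minimal sketch, assuming the yearly Boston crime CSVs sit in a data/ folder
# and share a DISTRICT column; paths and column names are illustrative only.
from pathlib import Path
import pandas as pd

# Combine the yearly files (2015-2022) into one DataFrame.
files = sorted(Path("data").glob("boston_crime_*.csv"))
incidents = pd.concat((pd.read_csv(f) for f in files), ignore_index=True)

# Count reported incidents per police district, ignoring rows with no district code.
district_totals = (
    incidents.dropna(subset=["DISTRICT"])
    .groupby("DISTRICT")
    .size()
    .sort_values(ascending=False)
)
print(district_totals)
```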
Select a district
Start by clicking one of the district buttons. This panel will update with sample interpretation text.
What the data shows at a glance
These totals come from the combined Boston crime data files for 2015–2022. High counts may look objective, but numbers alone do not explain context: neighborhood size, policing patterns, or how risk gets interpreted.
Highest recorded totals
- B2 — 99,635 incidents
- C11 — 86,111 incidents
- D4 — 84,927 incidents
Lower recorded totals
- A7 — 27,895 incidents
- E5 — 29,050 incidents
What algorithms might do
A system could turn these totals into rankings, flags, or risk scores. That can make the output seem neutral, even when it hides important social and geographic context.
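To make that concrete, here is a deliberately naive sketch of the kind of logic such a system might apply: raw totals are cut into "risk" labels with arbitrary thresholds, and nothing about population, reporting, or policing patterns enters the calculation. The thresholds are invented for illustration.

```python
# Naive risk-labeling sketch: thresholds are arbitrary placeholders, and the
# labels use raw counts only -- no population, reporting, or policing context.
district_totals = {"B2": 99_635, "C11": 86_111, "D4": 84_927, "A7": 27_895, "E5": 29_050}

def naive_risk_label(total_incidents: int) -> str:
    if total_incidents > 80_000:
        return "high risk"
    if total_incidents > 40_000:
        return "medium risk"
    return "low risk"

for district, total in sorted(district_totals.items(), key=lambda item: item[1], reverse=True):
    print(f"{district}: {total:,} incidents -> {naive_risk_label(total)}")
```

Because the thresholds see only totals, the districts with the largest raw counts are labeled "high risk" by construction, regardless of why those counts are large.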
What a ranking might emphasize
These five districts represent the highest totals in the dataset. Visual rankings like this quickly draw attention to the top values, emphasizing contrast and order while leaving out the context behind why those differences exist.
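A minimal version of that kind of ranking visual might look like the bar-chart sketch below. It plots only the district totals named on this page, so it illustrates the format rather than reproducing the full top five, and the styling choices are placeholders.

```python
# Bar-chart sketch of a ranking view; only the totals named on this page are
# plotted, and the raw counts are not population-adjusted.
import matplotlib.pyplot as plt

named_totals = {"B2": 99_635, "C11": 86_111, "D4": 84_927, "E5": 29_050, "A7": 27_895}

fig, ax = plt.subplots(figsize=(6, 3))
ax.bar(list(named_totals), list(named_totals.values()))
ax.set_ylabel("Reported incidents, 2015–2022")
ax.set_title("Raw district totals (not population-adjusted)")
plt.tight_layout()
plt.show()
```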
How bias happens
- Crime data gets grouped by place
- Algorithms may treat high counts as high “risk”
- Context gets removed
- Entire communities can be judged unfairly
Why this matters
For Boston students and residents, neighborhood reputation already affects where people live, travel, and feel safe. This project asks what happens when algorithms strengthen those assumptions.
Key takeaways
These patterns are not just about numbers. They show how data can shape assumptions about people, place, and safety when context is left out.
Data is not neutral in practice
Even when crime data looks objective, the way it is grouped, ranked, and interpreted can lead to biased conclusions about entire districts.
One number cannot define a place
Incident totals do not capture community history, reporting patterns, structural inequality, or the lived realities of the people who live there.
Algorithms can reinforce stereotypes
When systems turn district data into simplified labels like “high risk,” they can reproduce existing assumptions instead of helping people understand a place more fully.
Bias in action
Match each data point with a possible explanation. This section asks you to pause and consider how quickly numbers can turn into assumptions when context is missing.
It may seem like there is one obvious answer, but crime data can often be interpreted in multiple ways depending on what context is included or ignored.
Data point
Possible explanation
Sources, disclaimer, and further reading
This project combines public crime data with research on algorithmic bias, fairness, and policing. The district totals shown here are meant to illustrate how data can be interpreted by systems, not to define any neighborhood or community. Crime counts alone do not capture structural inequality, over-policing, underreporting, population differences, or lived experience.
Dataset
Boston crime data (2015–2022) from Kaggle, cleaned and summarized in Python for this project.
View the dataset
Academic research
- Berk et al. (2024) — Improving fairness in criminal justice algorithmic risk assessments
- Castro-Toledo & Gómez-Bellvís (2026) — Democratic use of technologies for urban security
- Chen & Dai (2026) — Facial recognition in digital policing
- Chen et al. (2026) — AI-powered robots in policing
- Ferdaus et al. (2026) — Trustworthy AI review
- Hu (2026) — Fairness in machine learning
- Huq (2019) — Racial equity in algorithmic criminal justice
- McElreath et al. (2022) — Pre-crime prediction and bias