Here's a question fleet managers ask constantly: "Should I buy lighter vehicles or downsize engines to improve fuel economy?" The answer lies in the data, but only if you know which vehicle attributes actually predict MPG and which are just noise. When we analyzed the classic automotive dataset (392 vehicles from model years 1970-1982), weight emerged as the dominant predictor, showing a -0.83 correlation with fuel efficiency. That's not coincidence. That's physics. For every 1,000 pounds you add to a vehicle, you sacrifice approximately 7.6 MPG. Horsepower costs you another 2.4 MPG per 100 HP.

But here's where it gets interesting: weight, horsepower, displacement, and cylinder count are so intercorrelated (pairwise correlations of roughly 0.86 to 0.95) that they're essentially measuring the same underlying phenomenon: vehicle size and power. This guide walks through automotive fuel efficiency analysis step by step, showing you how to identify the attributes that matter and separate true predictors from redundant variables.

What Is Automotive Fuel Efficiency Analysis and When Do You Need It?

Automotive fuel efficiency analysis is a correlation and regression analysis that quantifies the relationship between vehicle specifications (weight, horsepower, displacement, cylinders) and fuel economy (MPG). You're not running experiments here — you're analyzing observational data to identify patterns. That means you can find strong correlations, but claiming causation requires careful thinking about confounders.

You need this analysis when you're making decisions where fuel economy is a key outcome variable:

  • Fleet procurement: You're buying 200 vehicles and need to balance performance requirements against fuel costs. Which specifications give you the best MPG per dollar?
  • Product development: You're engineering a new vehicle platform. How much MPG improvement can you expect from a 300-lb weight reduction versus a 20-HP engine downsize?
  • Regulatory planning: Your company needs to meet CAFE standards. Which design changes deliver the most MPG gain per engineering hour?
  • Market positioning: You're targeting fuel-conscious buyers. What's the MPG threshold that separates "economy" from "standard" segments, and which attributes define that line?
  • Predictive modeling: You want to estimate MPG for vehicles not yet tested, based on their published specifications.

The analysis doesn't tell you what caused historical MPG improvements (that would require controlled experiments or quasi-experimental designs), but it does tell you which attributes are most predictive — and that's often enough for practical decision-making.
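As a rough illustration of the product-development question above, the two slopes quoted in the introduction (about 7.6 MPG per 1,000 lbs and 2.4 MPG per 100 HP) can be turned into back-of-envelope arithmetic. This is only a sketch: it assumes the two slopes can be applied independently, which is doubtful given how strongly weight and horsepower move together, and the function name is made up for the example.

```python
# Slopes quoted in the introduction. Treating them as fixed, independent
# effects is a simplifying assumption, not a result of the case study.
MPG_PER_LB = 7.6 / 1000   # MPG lost per pound added
MPG_PER_HP = 2.4 / 100    # MPG lost per horsepower added


def estimated_mpg_gain(weight_cut_lb: float = 0.0, hp_cut: float = 0.0) -> float:
    """Rough MPG gain from cutting weight and/or horsepower."""
    return weight_cut_lb * MPG_PER_LB + hp_cut * MPG_PER_HP


weight_option = estimated_mpg_gain(weight_cut_lb=300)
engine_option = estimated_mpg_gain(hp_cut=20)
print(f"300-lb weight cut: ~{weight_option:.2f} MPG")
print(f"20-HP downsize:    ~{engine_option:.2f} MPG")
```

Under these assumptions the 300-lb weight cut is worth roughly 2.3 MPG and the 20-HP downsize roughly 0.5 MPG.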

Correlation vs. Causation in Automotive Data

When you see that weight correlates with MPG at r = -0.83, you might be tempted to say "reducing weight causes better fuel efficiency." That's probably true — physics supports it — but the correlation alone doesn't prove it. Lighter vehicles might also have smaller engines, better aerodynamics, or newer technology. To make a causal claim, you'd need to control for confounders or run experiments where you randomly assign weight levels (impractical for vehicles). For fleet decisions, the correlation is enough: buy lighter vehicles, get better MPG. For engineering decisions, you need to dig deeper into the mechanism.

Descriptive Statistics

Before you start looking for relationships, you need to understand the distribution of your variables. The descriptive statistics table shows mean, median, standard deviation, min, and max for MPG, weight, horsepower, displacement, cylinders, and model year across all 392 vehicles.

Average MPG sits at 23.4, but there's enormous spread: the standard deviation is 7.8 MPG, with a range from 9 MPG (gas guzzlers) to 46.6 MPG (ultra-efficient economy cars). That's a 5:1 ratio. Weight averages 2,970 lbs but ranges from 1,613 lbs to 5,140 lbs, a ratio of roughly 3:1. Horsepower averages 104 HP with a standard deviation of 38 HP. Displacement averages 194 cubic inches (ranging from 68 to 455). The median cylinder count is 4, but the dataset includes everything from 3-cylinder microcars to 8-cylinder muscle cars.

Here's what matters: the variance in these attributes gives you statistical power to detect relationships. If every vehicle weighed 3,000 lbs ± 50 lbs, you couldn't learn much about how weight affects MPG. But with a 3,500-lb range, the correlation analysis has plenty of leverage. The wide spread in MPG (9 to 46.6) also means you're capturing vehicles across the full efficiency spectrum, not just a narrow band.

One more thing: check for skewness. MPG, weight, and horsepower are all roughly symmetric (mean ≈ median), which is good news for linear regression. If you saw heavy skew, you'd consider log-transforming the variables before modeling. But here, the distributions are clean enough to work with directly.
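A minimal sketch of the summary table and the mean-versus-median skew check, using only Python's standard library. The MPG values are made up for illustration and are not rows from the 392-vehicle dataset; `describe` and `skew_hint` are hypothetical helper names.

```python
import statistics

# Illustrative values only, not rows from the 392-vehicle dataset.
mpg = [18.0, 15.0, 24.0, 27.0, 31.0, 22.0, 36.0, 14.0, 26.0, 29.0]


def describe(values):
    """Return the summary statistics used in the descriptive table."""
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sd": statistics.stdev(values),   # sample standard deviation
        "min": min(values),
        "max": max(values),
    }


def skew_hint(summary):
    """Gap between mean and median in SD units; near 0 suggests symmetry."""
    return (summary["mean"] - summary["median"]) / summary["sd"]


summary = describe(mpg)
print(summary)
print(f"mean-median gap: {skew_hint(summary):+.2f} SD")
```

A gap close to zero is consistent with a roughly symmetric distribution; a large positive gap would point to right skew and might justify a log transform.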

Attribute Correlation Matrix

The correlation matrix is the single most important table in this analysis. It shows Pearson correlation coefficients between every pair of variables. Values range from -1 (perfect negative correlation) to +1 (perfect positive correlation), with 0 meaning no linear relationship.
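For readers who want to see how such a matrix is built, here is a small sketch that computes Pearson's r for every pair of columns in plain Python. The rows are illustrative, not the case-study data, so the printed coefficients will not match the figures quoted in this article.

```python
import math

# Illustrative rows only (not the case-study data): one list per variable.
data = {
    "mpg":        [36.0, 31.0, 27.0, 24.0, 19.0, 15.0, 13.0],
    "weight":     [1800, 2100, 2500, 2900, 3400, 4100, 4600],
    "horsepower": [60, 70, 88, 95, 110, 150, 175],
}


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


names = list(data)
matrix = {a: {b: pearson(data[a], data[b]) for b in names} for a in names}

# Print one row per variable, signed to two decimals.
for a in names:
    print(a.ljust(10), "  ".join(f"{matrix[a][b]:+.2f}" for b in names))
```

In practice a data-frame library would do this in one call; the explicit version is shown only to make the calculation visible.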

Here's what jumps out: weight has the strongest correlation with MPG at r = -0.83. That's a very strong negative relationship — as weight increases, MPG drops, consistently and predictably. Displacement follows closely at r = -0.80, then cylinders and horsepower both at r = -0.78. Model year shows a positive correlation of +0.58, reflecting fuel efficiency improvements over time (1970-1982 saw massive MPG gains due to oil crises and CAFE standards).

But here's the critical insight: the predictor variables are massively intercorrelated with each other. Weight and displacement correlate at +0.93. Weight and horsepower at +0.86. Displacement and cylinders at +0.95. These aren't independent predictors — they're different manifestations of the same underlying factor: vehicle size and power. When you add an extra cylinder, you increase displacement, which adds weight, which requires more horsepower. They move together.

This is called multicollinearity, and it has consequences. If you build a regression model with all four predictors, the coefficients become unstable and hard to interpret. You can't say "holding displacement constant, a 1-cylinder increase reduces MPG by X" because in real-world vehicles, cylinder count and displacement don't vary independently. For prediction, multicollinearity is fine — the model still works. For causal interpretation, it's a nightmare. You'd need to choose one primary predictor (probably weight, since it has the strongest correlation and the clearest physical mechanism) and control for the others, or use dimension reduction techniques like principal component analysis.
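One common way to put a number on multicollinearity is the variance inflation factor (VIF). The sketch below uses the two-predictor shortcut VIF = 1 / (1 - r²), applied to the predictor-to-predictor correlations quoted above. It is a lower bound rather than the full VIF, since adding more predictors can only raise the value; `pairwise_vif` is a name chosen for this example.

```python
def pairwise_vif(r: float) -> float:
    """Variance inflation factor implied by one pairwise correlation r.

    With exactly two predictors, VIF = 1 / (1 - r**2). With more predictors
    the true VIF can only be higher, so treat this as a lower bound.
    """
    return 1.0 / (1.0 - r ** 2)


# Predictor-to-predictor correlations quoted in the text.
quoted = {
    ("weight", "displacement"): 0.93,
    ("weight", "horsepower"): 0.86,
    ("displacement", "cylinders"): 0.95,
}

for (a, b), r in quoted.items():
    print(f"{a} / {b}: r = {r:.2f}, VIF >= {pairwise_vif(r):.1f}")
```

A frequently used rule of thumb treats VIF values above 5 or 10 as a warning sign, which is the territory the displacement and cylinder pairings fall into here.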

How to Read Correlation Strength

Pearson's r measures linear correlation strength. Rule of thumb: |r| < 0.3 is weak, 0.3-0.7 is moderate, >0.7 is strong. The automotive data shows strong correlations across the board. Weight at -0.83 means that 69% of MPG variance (r²) is explained by weight alone. But don't ignore moderate correlations — a 0.58 correlation with model year is still meaningful, especially when you're tracking time trends.
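The rule of thumb and the r² arithmetic can be written as a tiny helper. This is a sketch of the thresholds stated above, nothing more; `describe_r` is an illustrative name.

```python
def describe_r(r: float) -> str:
    """Label a Pearson r using the rule of thumb above and report r squared."""
    size = abs(r)
    if size < 0.3:
        label = "weak"
    elif size <= 0.7:
        label = "moderate"
    else:
        label = "strong"
    return f"r = {r:+.2f} ({label}), r^2 = {r * r:.2f}"


print(describe_r(-0.83))  # weight vs. MPG
print(describe_r(0.58))   # model year vs. MPG
```

For weight this gives r² = 0.69, the 69% figure quoted above; for model year the same arithmetic gives about 0.34.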

Weight vs. Fuel Efficiency

The scatter plot makes the relationship visceral. Each point is a vehicle, plotted by weight (x-axis) and MPG (y-axis), color-coded by cylinder count. The downward trend is unmistakable: heavier vehicles get worse fuel economy. The relationship is remarkably linear — not exponential, not logarithmic, just a straight diagonal band from top-left (light, efficient) to bottom-right (heavy, thirsty).

Here's what you can read off the chart: vehicles under 2,000 lbs cluster in the 35-45 MPG range. Vehicles in the 2,500-3,000 lb range typically deliver 25-30 MPG. Vehicles above 4,000 lbs rarely break 15 MPG. There's some scatter around the trend line (that's the 31% of variance not explained by weight), but the central tendency is clear. For every 1,000 lbs you add, you lose roughly 7-8 MPG.
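The "MPG lost per 1,000 lbs" figure is just the slope of a least-squares line through the scatter. The following sketch fits that line in plain Python on illustrative rows (not the case-study data), so the slope it prints will differ from the 7.6 figure quoted in this article.

```python
# Illustrative rows only, not the case-study data.
weight_lb = [1800, 2100, 2500, 2900, 3400, 4100, 4600]
mpg = [36.0, 31.0, 27.0, 24.0, 19.0, 15.0, 13.0]


def fit_line(x, y):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


intercept, slope = fit_line(weight_lb, mpg)
print(f"MPG = {intercept:.1f} + ({slope * 1000:.2f} per 1,000 lb) * weight")
```

Multiplying the per-pound slope by 1,000 gives the more readable per-1,000-lb figure used throughout this guide.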

The color coding reveals another pattern: cylinder count tracks closely with weight. The 4-cylinder vehicles (blue) dominate the light-and-efficient zone (under 2,500 lbs, above 25 MPG). The 8-cylinder vehicles (red) cluster in the heavy-and-thirsty zone (above 3,500 lbs, under 20 MPG). The 6-cylinder vehicles (green) occupy the middle ground. This isn't surprising — it's another manifestation of the multicollinearity we saw in the correlation matrix. Bigger engines require bigger vehicles, which weigh more, which need more power. It's a design bundle.

For fleet managers, the takeaway is simple: if MPG is a priority, set a hard weight ceiling. A 3,000-lb maximum would eliminate most of the low-MPG vehicles. For engineers, the linear fit implies that a 500-lb reduction is worth about 3.8 MPG anywhere in the range (7.6 MPG per 1,000 lbs). The scatter hints that the real gain may vary by starting point: going from 4,500 lbs to 4,000 lbs might gain you 2-3 MPG, while going from 2,500 lbs to 2,000 lbs could gain you 4-5 MPG. The relationship is close to linear overall, but the practical impact depends on where you start.

MPG Distribution by Cylinder Count

The box plot breaks down MPG distribution by cylinder count (3, 4, 5, 6, 8 cylinders). This is where you see not just the average difference, but the full spread: median, quartiles, range, and outliers. It's a more complete picture than a simple mean comparison.

4-cylinder vehicles show a median MPG around 29, with an interquartile range (IQR) of roughly 26-32 MPG. The distribution is fairly tight, with a few high-efficiency outliers pushing above 40 MPG. 6-cylinder vehicles drop to a median around 19 MPG, with an IQR of 17-21 MPG. 8-cylinder vehicles bottom out at a median near 14 MPG, with an IQR of 12-16 MPG. The difference between 4-cylinder and 8-cylinder medians is about 15 MPG — more than double the fuel efficiency.

What's interesting is the overlap. The bottom quartile of 4-cylinder vehicles dips down to 23-24 MPG, which overlaps with the top quartile of 6-cylinder vehicles. This tells you that cylinder count alone doesn't determine MPG: there's enough variance within each group that a well-designed 6-cylinder can beat a poorly designed 4-cylinder. But on average, fewer cylinders wins.

The outliers are instructive. You see a handful of 4-cylinder vehicles hitting 40+ MPG (the ultra-efficient economy cars from the early 1980s), and a few 8-cylinder vehicles scraping the bottom at 10-12 MPG (the muscle cars and full-size sedans from the early 1970s). These outliers aren't errors — they're real vehicles at the extremes of the design space. If you're building a predictive model, you need to decide whether to include them (for generalizability) or exclude them (for robustness). For fleet decisions, they define the boundaries of what's possible.

Why Box Plots Beat Bar Charts for Distributions

A bar chart showing "average MPG by cylinder count" would give you one number per group. A box plot gives you five: minimum, Q1, median, Q3, maximum. That extra information matters. It tells you whether the distributions overlap, whether there are outliers, and how much spread exists within each group. Always visualize distributions, not just means. Means can lie.
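The numbers behind a box plot are easy to compute directly. This sketch derives the five-number summary per cylinder group and flags points outside the conventional 1.5 × IQR fences. The MPG values are illustrative, not the case-study data, and the quartile method (`"inclusive"`) is one of several conventions, so plotting tools may report slightly different quartiles.

```python
import statistics

# Illustrative MPG values per cylinder count, not the case-study data.
mpg_by_cyl = {
    4: [24.0, 26.0, 27.0, 29.0, 30.0, 32.0, 33.0, 44.0],
    6: [16.0, 17.0, 18.0, 19.0, 20.0, 21.0, 23.0],
    8: [10.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0],
}


def box_stats(values):
    """Five-number summary plus points outside the 1.5 * IQR fences."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    outliers = [v for v in values if v < low or v > high]
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values), "outliers": outliers}


for cyl, values in mpg_by_cyl.items():
    print(cyl, box_stats(values))
```

In this made-up sample only the 44 MPG 4-cylinder value falls outside the fences, which is the kind of point a box plot draws as an individual dot.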

Fuel Efficiency Trend by Model Year

The time series plot shows average MPG by model year from 1970 to 1982. This is where you see the external shocks — the 1973 oil embargo and the 1979 energy crisis — hit the data like a hammer. Average MPG was flat around 17-18 MPG from 1970-1973, then jumped to 22 MPG by 1974. It plateaued briefly in the mid-1970s, then surged again from 1978-1980, reaching 31+ MPG by 1982. That's a 79% improvement over 12 years.

Here's why this matters: the correlation between model year and MPG (+0.58) reflects a real engineering response to market conditions. When gas prices spiked and CAFE standards kicked in, automakers redesigned their fleets. They didn't just optimize existing platforms — they introduced fundamentally lighter, smaller vehicles with more efficient engines. The jump from 1973 to 1974 (+3.9 MPG in one year) and from 1979 to 1980 (+4.2 MPG) were step changes, not gradual improvements. That's what happens when external pressure forces rapid innovation.
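The trend line comes from averaging MPG within each model year and then differencing consecutive years. Here is a sketch of that calculation on a few illustrative rows (not the case-study data), followed by the percentage-change arithmetic using the 1970 and 1982 averages quoted in the FAQ.

```python
from collections import defaultdict

# Illustrative (model_year, mpg) rows, not the case-study data.
rows = [(70, 18.0), (70, 17.0), (73, 17.5), (73, 16.5),
        (74, 22.0), (74, 23.0), (82, 31.0), (82, 32.0)]

by_year = defaultdict(list)
for year, mpg in rows:
    by_year[year].append(mpg)

# Average MPG per model year, in chronological order.
yearly_mean = {y: sum(v) / len(v) for y, v in sorted(by_year.items())}

# Change between consecutive years present in the data.
years = list(yearly_mean)
changes = {(a, b): yearly_mean[b] - yearly_mean[a]
           for a, b in zip(years, years[1:])}


def pct_change(start: float, end: float) -> float:
    """Percentage change from start to end."""
    return (end - start) / start * 100


print(yearly_mean)
print(changes)
# Averages quoted in the FAQ: 17.7 MPG (1970) and 31.7 MPG (1982).
print(f"{pct_change(17.7, 31.7):.0f}% improvement")
```

The largest entries in `changes` are the step changes; with real data they would be the first place to look for the effect of an external shock.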

For analysts, this time trend is both a signal and a confounder. It's a signal if you're interested in how fuel efficiency evolves over time — maybe you're forecasting future MPG trends or estimating the impact of new regulations. It's a confounder if you're trying to isolate the effect of weight or horsepower on MPG — because weight, horsepower, and model year are all correlated. Newer vehicles are lighter and more efficient. If you don't control for year, you'll overestimate the weight effect. If you don't control for weight, you'll overestimate the year effect.

This is where multiple regression becomes essential. You'd run a model like MPG ~ weight + horsepower + model_year to partial out the independent effects. Correlation analysis tells you that all three variables matter; regression tells you how much each matters when you hold the others constant.
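To make "holding the others constant" concrete, here is a dependency-free sketch of a three-predictor least-squares fit using the normal equations. The rows are synthetic: they are generated from a made-up formula with no noise, purely so the recovered coefficients can be checked. They are not the case-study data, and the coefficients are not the case-study results. Weight is expressed in thousands of pounds, horsepower in hundreds, and year as years since 1970.

```python
def solve(a, b):
    """Solve the linear system a @ x = b by Gauss-Jordan elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]   # augmented matrix
    for col in range(n):
        # Partial pivoting: move the largest entry in this column up.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(predictors, y):
    """Least-squares coefficients [intercept, b1, b2, ...] via normal equations."""
    x = [[1.0] + list(row) for row in predictors]      # prepend intercept
    k = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(x, y)) for i in range(k)]
    return solve(xtx, xty)


# Synthetic rows: (weight in 1,000 lb, horsepower in 100 HP, years since 1970).
rows = [(1.8, 0.60, 12), (2.1, 0.70, 10), (2.5, 0.88, 8), (2.9, 0.95, 6),
        (3.4, 1.10, 4), (4.1, 1.50, 2), (4.6, 1.75, 0), (3.0, 1.30, 9)]
# Made-up generating formula, chosen only so the fit can be verified.
mpg = [45 - 6 * w - 2 * h + 0.5 * yr for w, h, yr in rows]

intercept, b_weight, b_hp, b_year = ols(rows, mpg)
print(f"weight: {b_weight:.2f} MPG per 1,000 lb, "
      f"horsepower: {b_hp:.2f} per 100 HP, year: {b_year:.2f} per year")
```

Each coefficient is the change in MPG for a one-unit change in that predictor with the other two held fixed. With real, noisy data a statistics package is the better choice, since it also reports standard errors and p-values.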

How to Interpret Your Results and Make Decisions

You've seen the correlations, the scatter plots, the distributions, and the time trends. Now what? Here's how to turn these patterns into decisions.

For Fleet Procurement

Set specification thresholds based on the data. If you need 25+ MPG on average, the box plot tells you that 4-cylinder vehicles are your safest bet — the median is 29 MPG, and even the bottom quartile exceeds 23 MPG. If you can tolerate 20 MPG, 6-cylinder vehicles become viable. The scatter plot adds another constraint: don't buy anything above 3,000 lbs if MPG is a priority. Weight is the single strongest predictor, so a weight ceiling is the most effective procurement rule.

Use the correlation matrix to avoid redundant specifications. Don't write a requirement like "must have ≤6 cylinders AND ≤250 cubic inches displacement AND ≤3,000 lbs weight" — those are all measuring the same thing (r = 0.93-0.95). Pick one primary constraint (weight) and let the others follow naturally. You'll get better vendor responses and avoid over-constraining the solution space.

For Product Development

Prioritize weight reduction over engine downsizing if you can only do one. Weight has the strongest correlation with MPG (-0.83 vs. -0.78 for horsepower), and reducing weight often enables smaller engines as a secondary benefit. The scatter plot shows that the weight-MPG relationship is linear and consistent across the full range — there's no weight threshold where the benefit disappears.

But don't ignore the time trend. The 1970-1982 data shows that fuel efficiency improved 79% in 12 years, driven by regulatory pressure and technology advances. If you're designing a vehicle for 2030, you need to benchmark against current technology, not 1982 technology. The correlation structure (weight, horsepower, displacement) probably still holds, but the baseline MPG for a given weight class has shifted. Use the historical data to understand relationships, not absolute levels.

For Predictive Modeling

Build a multiple regression model with weight as the primary predictor, horsepower as a secondary predictor, and model year as a control variable. Don't include displacement and cylinders: they're too collinear with weight and with each other (the correlation matrix shows 0.93 for weight and displacement, 0.95 for displacement and cylinders) and add more noise than signal. A three-variable model MPG ~ weight + horsepower + model_year will likely explain 80-85% of variance with stable, interpretable coefficients.

Check your residuals. If the scatter around the regression line is random (no patterns, constant variance), you're good. If you see patterns — like higher residuals for certain cylinder counts or certain years — you might need interaction terms or nonlinear transformations. The scatter plot suggests the relationship is linear, but always validate with residual diagnostics.
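One simple residual check is to average the residuals within each cylinder group: a group mean far from zero means the model is systematically off for that group. The sketch below assumes you already have actual and predicted MPG for each vehicle. The rows are illustrative, and the 1.0 MPG flag threshold is an arbitrary choice for the example.

```python
from collections import defaultdict
from statistics import mean

# Illustrative (cylinders, actual_mpg, predicted_mpg) rows; the predictions
# stand in for the output of whatever regression model was fitted.
rows = [(4, 30.0, 28.5), (4, 27.0, 26.0), (4, 33.0, 31.0),
        (6, 19.0, 20.0), (6, 21.0, 21.5), (6, 18.0, 18.5),
        (8, 14.0, 13.0), (8, 12.0, 13.5), (8, 15.0, 14.5)]

residuals_by_group = defaultdict(list)
for cylinders, actual, predicted in rows:
    residuals_by_group[cylinders].append(actual - predicted)

# A group mean far from zero means the model is systematically off there.
group_bias = {g: mean(r) for g, r in sorted(residuals_by_group.items())}

for group, bias in group_bias.items():
    flag = "check" if abs(bias) > 1.0 else "ok"   # arbitrary threshold
    print(f"{group}-cyl: mean residual {bias:+.2f} MPG ({flag})")
```

In this made-up sample the model under-predicts 4-cylinder vehicles by about 1.5 MPG, the kind of pattern that might call for an interaction term.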

What This Analysis Tells You (and What It Doesn't)

Automotive fuel efficiency analysis is powerful for identifying predictive relationships. It tells you that weight, horsepower, displacement, and cylinders are all strongly correlated with MPG. It tells you that weight is the dominant predictor. It tells you that fuel efficiency improved dramatically from 1970 to 1982. All of that is actionable for fleet decisions, product planning, and predictive modeling.

But here's what it doesn't tell you: it doesn't prove causation. The correlation between weight and MPG is almost certainly causal (physics supports it), but the analysis alone doesn't prove it. Lighter vehicles might also have better aerodynamics, more efficient transmissions, or newer engine technology — all confounders. To claim that weight causes MPG changes, you'd need to control for those confounders (in a regression model) or run an experiment where you randomly assign weight levels (impractical).

This is why I'm skeptical when analysts say "this data proves X causes Y" without experimental evidence. Correlation analysis is a starting point, not an ending point. It generates hypotheses (weight reduction improves MPG) that you then test with more rigorous methods — controlled experiments, quasi-experimental designs like regression discontinuity, or instrumental variable analysis.

For most business decisions, though, the correlation is enough. You don't need a randomized trial to decide that buying lighter vehicles will improve your fleet's fuel economy. The -0.83 correlation gives you high confidence, and the linear scatter plot shows the relationship is consistent. Just don't overclaim. Say "weight is the strongest predictor of MPG" (true), not "weight is the only factor that matters" (false).

When Correlation Analysis Isn't Enough

There are cases where you do need causal inference, not just correlation. For example:

  • Regulatory impact estimation: If you want to know whether CAFE standards caused the 1974-1982 MPG improvements (vs. just coinciding with them), you need a quasi-experimental design that compares regulated vehicles to a control group (maybe Canadian or European vehicles not subject to CAFE). Correlation with model year tells you efficiency improved, but not why.
  • Component-level optimization: If you're deciding whether to invest in lightweight materials (aluminum vs. steel), the aggregate weight-MPG correlation isn't enough. You need to isolate the marginal effect of material choice, holding engine, aerodynamics, and transmission constant. That requires either A/B testing (build two versions, measure MPG) or matched-pair analysis (find vehicles identical except for material choice).
  • Counterfactual forecasting: If you want to know "what would 2025 MPG look like if we'd never passed CAFE standards?", correlation analysis can't answer that. You need a structural causal model or a synthetic control method.

For these questions, causal inference techniques are the right tool. But for most fleet and product decisions, the correlation analysis gives you what you need: a clear ranking of which attributes predict MPG, how strong the relationships are, and whether the patterns are consistent enough to bet on.

Common Pitfalls and How to Avoid Them

Pitfall #1: Including too many correlated predictors. If you build a regression model with weight, displacement, horsepower, and cylinders (all correlating at 0.86+), the coefficients become unstable. A small change in the data flips the signs or changes the magnitudes. Solution: pick one primary predictor (weight) and drop the others, or use dimension reduction (principal component analysis) to combine them into a single "size/power" factor.

Pitfall #2: Ignoring time trends. The 1970-1982 dataset shows a strong upward trend in MPG over time (+0.58 correlation with model year). If you don't control for year in your regression model, you'll confound the weight effect with the technology-improvement effect. Newer vehicles are lighter and more efficient for other reasons (better engines, aerodynamics, transmissions). Always include year as a control variable if your data spans multiple years.

Pitfall #3: Extrapolating beyond the data range. The scatter plot shows vehicles from 1,613 lbs to 5,140 lbs. Don't use the regression equation to predict MPG for a 6,000-lb vehicle or a 1,000-lb vehicle — you're outside the observed range, and the linear relationship might not hold. Extrapolation assumes the pattern continues indefinitely, which is rarely true. Stick to interpolation within the observed range.

Pitfall #4: Treating correlation as causation. We've covered this, but it's worth repeating. A -0.83 correlation between weight and MPG is strong evidence of a relationship, but it doesn't prove that reducing weight will improve MPG. There could be confounders (lighter vehicles have smaller engines, better aerodynamics, newer tech). For causal claims, you need experimental or quasi-experimental designs. For predictive claims, correlation is fine.

Pitfall #5: Ignoring outliers without investigation. The box plot shows a few 4-cylinder vehicles hitting 40+ MPG and a few 8-cylinder vehicles scraping 10 MPG. Don't automatically drop them as "outliers" — they're real vehicles at the extremes of the design space. Investigate why they're outliers. Maybe they represent breakthrough designs (the 40+ MPG cars) or obsolete technology (the 10 MPG cars). Either way, they contain information. Only exclude outliers if you have evidence they're measurement errors, not genuine extreme values.
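Pitfall #3 is easy to guard against in code: refuse to predict outside the observed weight range. The sketch below hard-codes the 1,613 to 5,140 lb range quoted above. The slope is the 7.6 MPG per 1,000 lbs figure from this article, while the intercept is a placeholder chosen for illustration, not a fitted value from the case study.

```python
# Observed weight range in the case-study data, as quoted above.
MIN_WEIGHT_LB, MAX_WEIGHT_LB = 1613, 5140


def predict_mpg(weight_lb: float, intercept: float, slope: float) -> float:
    """Predict MPG from a fitted line, refusing to extrapolate."""
    if not MIN_WEIGHT_LB <= weight_lb <= MAX_WEIGHT_LB:
        raise ValueError(
            f"{weight_lb} lb is outside the observed range "
            f"{MIN_WEIGHT_LB}-{MAX_WEIGHT_LB} lb"
        )
    return intercept + slope * weight_lb


# Placeholder intercept for illustration; substitute your fitted values.
INTERCEPT, SLOPE = 46.0, -0.0076

print(predict_mpg(3000, INTERCEPT, SLOPE))
try:
    predict_mpg(6000, INTERCEPT, SLOPE)
except ValueError as err:
    print("refused:", err)
```

Raising an error rather than returning a number forces whoever calls the model to decide explicitly how to handle vehicles outside the training range.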

Always Validate Regression Assumptions

If you build a predictive model from this data, check four assumptions: (1) linearity (scatter plot should show a straight trend), (2) independence (residuals should be uncorrelated — watch out for time-series autocorrelation), (3) homoscedasticity (residual variance should be constant across the range), and (4) normality of residuals (histogram should be bell-shaped). Violating these assumptions doesn't always break the model, but it degrades the accuracy of confidence intervals and p-values.
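Two of these checks reduce to short calculations on the residuals. The sketch below computes the Durbin-Watson statistic for independence and a crude spread ratio for homoscedasticity. The residuals are illustrative, and the spread ratio is a simplified stand-in for formal tests, not a standard named test. Note that the two checks expect different orderings: by time for Durbin-Watson, by fitted value for the spread ratio.

```python
from statistics import pstdev

# Illustrative residuals, not case-study output.
residuals = [0.5, 1.0, -0.7, -0.9, 0.8, -0.2, -1.1, 0.6]


def durbin_watson(res):
    """Durbin-Watson statistic for residuals ordered in time.

    Ranges from 0 to 4; values near 2 suggest no first-order autocorrelation.
    """
    num = sum((b - a) ** 2 for a, b in zip(res, res[1:]))
    return num / sum(e ** 2 for e in res)


def spread_ratio(res):
    """Ratio of residual SD in the second half to the first half.

    Assumes res is sorted by fitted value (or by a predictor); a ratio far
    from 1 hints at non-constant variance.
    """
    half = len(res) // 2
    return pstdev(res[half:]) / pstdev(res[:half])


print(f"Durbin-Watson: {durbin_watson(residuals):.2f}")
print(f"Spread ratio:  {spread_ratio(residuals):.2f}")
```

Linearity and normality are usually easier to judge by eye, from a residual-versus-fitted plot and a histogram or Q-Q plot.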

How MCP Analytics Makes This Analysis Effortless

Running automotive fuel efficiency analysis manually means writing code to load the data, compute correlations, generate scatter plots, create box plots, fit regressions, and export charts. That's 2-3 hours if you're fluent in R or Python. If you're not, it's a full day of Stack Overflow searches.

MCP Analytics compresses that to 60 seconds. Upload your CSV with columns for MPG, weight, horsepower, displacement, cylinders, and model year. The platform automatically generates:

  • Descriptive statistics table (mean, median, SD, range for all variables)
  • Correlation matrix with color-coded heatmap
  • Weight vs. MPG scatter plot with cylinder-count color coding
  • MPG distribution by cylinder count (box plot)
  • MPG trend by model year (line chart)
  • Multiple regression output (coefficients, R², p-values)
  • Downloadable report with all charts publication-ready

You don't write a single line of code. You don't debug plot formatting. You don't manually check correlation assumptions. The platform handles it, and you get a complete analysis in the time it takes to get a coffee. For fleet managers, that means you can run the analysis during a procurement meeting and make data-driven decisions on the spot. For engineers, it means you can iterate through multiple design scenarios (what if we cap weight at 2,500 lbs? 3,000 lbs? 3,500 lbs?) and see the MPG impact immediately. For analysts, it means you spend your time interpreting results, not wrestling with code.

Related Techniques and When to Use Each

Automotive fuel efficiency analysis is one tool in a broader toolkit. Here's how it fits with related techniques:

Correlation Analysis: This is the foundation. Use it whenever you want to quantify the strength and direction of relationships between two or more continuous variables. In the automotive context, it tells you which attributes (weight, horsepower, displacement) correlate most strongly with MPG. Limitation: it doesn't tell you which variable is the cause and which is the effect. It just says they move together.

Multiple Regression: This is the next step after correlation. Use it when you want to predict a continuous outcome (MPG) from multiple predictors (weight, horsepower, year) while controlling for confounders. Regression gives you coefficients that quantify the marginal effect of each predictor (e.g., "a 1,000-lb increase in weight reduces MPG by 7.6, holding horsepower and year constant"). It's more powerful than correlation for prediction and causal inference, but it requires you to specify a model structure (linear, polynomial, interactions) and validate assumptions.

Time Series Analysis: Use this when the time ordering of your data matters. The MPG-by-year trend is a simple time series. If you wanted to forecast future MPG based on historical trends, you'd use time series methods like ARIMA or exponential smoothing. If you wanted to detect the exact point where fuel efficiency jumped (1973? 1974?), you'd use change point detection. For the automotive dataset, time series analysis would help you model the 1970-1982 trend and project forward to 1990 or 2000.

Causal Inference: Use this when you need to make causal claims, not just predictive claims. If you want to know whether CAFE standards caused the MPG improvements (vs. just coinciding with them), correlation and regression aren't enough — you need a quasi-experimental design. Techniques like difference-in-differences, regression discontinuity, or instrumental variables let you estimate causal effects from observational data. For the automotive dataset, you might compare U.S. vehicles (subject to CAFE) to Canadian vehicles (not subject) in a difference-in-differences framework.

Cluster Analysis: Use this when you want to group vehicles into natural segments based on their attributes. Instead of predicting MPG, you're asking "are there distinct vehicle archetypes in the data?" Cluster analysis might reveal groups like "economy cars" (light, low HP, high MPG), "muscle cars" (heavy, high HP, low MPG), and "mid-size sedans" (middle of the road). This is useful for market segmentation and product positioning.

Frequently Asked Questions

What vehicle attributes have the strongest correlation with fuel efficiency?

Weight shows the strongest negative correlation with MPG (r = -0.83), followed by displacement (r = -0.80), then horsepower and cylinder count (both r = -0.78). These predictors are also highly correlated with one another (weight, displacement, and horsepower correlate at roughly 0.86-0.93), which suggests they're measuring overlapping aspects of vehicle size and power.

How much does vehicle weight affect fuel economy?

For every 1,000 pounds of vehicle weight, fuel efficiency drops approximately 7.6 MPG. A 2,000-lb vehicle might achieve 35+ MPG, while a 4,500-lb vehicle typically delivers under 15 MPG. This relationship is remarkably linear across the weight spectrum.

Did the oil crises of the 1970s actually improve fleet fuel efficiency?

Yes, dramatically. Average MPG rose from 17.7 in 1970 to 31.7 by 1982, a 79% improvement. The steepest gains occurred 1973-1974 (+3.9 MPG) and 1979-1980 (+4.2 MPG), immediately following the 1973 oil embargo and the 1979 energy crisis. This demonstrates how regulatory pressure and consumer demand can drive rapid engineering improvements.

Can you predict MPG from vehicle specifications alone?

Yes, with reasonable accuracy. Weight, horsepower, and model year together explain roughly 80-85% of MPG variance. However, correlation analysis alone doesn't tell you the causal mechanism: confounding variables like engine technology, aerodynamics, and transmission type also play roles. For causal claims, you'd need controlled experiments or instrumental variable analysis.

Should fleet managers focus on weight reduction or engine downsizing to improve fuel economy?

Weight reduction offers the most direct path. Because weight, displacement, and horsepower are so highly intercorrelated (0.86 or higher), reducing weight often enables smaller engines and lower horsepower, delivering compounding MPG gains. A 500-lb weight reduction could improve fuel efficiency by 3-4 MPG while maintaining acceptable performance in many use cases.

The Bottom Line: What You Can Learn from Vehicle Attribute Data

Automotive fuel efficiency analysis is observational data science at its most practical. You're not running experiments — you're mining historical data to find patterns that predict outcomes. The patterns are strong: weight correlates with MPG at -0.83, explaining 69% of variance by itself. Add horsepower and model year, and you're explaining 80-85% of variance. That's enough predictive power for fleet procurement, product planning, and regulatory forecasting.

But remember the limits. Correlation doesn't prove causation. The data shows that lighter vehicles get better MPG, but it doesn't prove that making vehicles lighter will improve MPG (though physics strongly suggests it). There could be confounders. The only way to know for sure is to run a controlled experiment — or at least use quasi-experimental methods to isolate the causal effect.

For most business decisions, though, you don't need causal proof. You just need strong predictive relationships. And this analysis delivers them. Weight is the dominant predictor. Horsepower is secondary. Time trends matter. Cylinder count and displacement are redundant once you control for weight. Armed with those insights, you can make smarter procurement decisions, set better design targets, and build more accurate predictive models.

And if you need to run the analysis yourself? Upload your CSV to MCP Analytics and get results in 60 seconds. No coding required. No statistics degree required. Just data in, insights out.
