6.894: Interactive Data Visualization
Assignment 1: Visualization Design
In this assignment, you will design a visualization for a small data set and provide a rigorous rationale for your design choices. You should, in theory, be ready to explain the contribution of every pixel in the display. You are free to use any graphics or charting tool you please, including drafting the chart by hand. However, you may find it most instructive to create the chart from scratch using a graphics API of your choice.
(See Resources for a list of visualization tools.)
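As one illustration of what "from scratch" could look like, the sketch below builds a small horizontal bar chart as an SVG string using only the Python standard library. It is a minimal sketch, not a recommended design: the helper names (`linear_scale`, `bar_chart_svg`), the layout constants, and the values are all hypothetical placeholders rather than anything taken from the assignment data.

```python
def linear_scale(domain_max, range_max):
    """Return a function mapping [0, domain_max] linearly onto [0, range_max]."""
    def scale(value):
        return value / domain_max * range_max
    return scale


def bar_chart_svg(values, labels, title, width=400, bar_height=20, gap=6):
    """Build a horizontal bar chart as an SVG string (hypothetical layout)."""
    left, top = 60, 30                       # margins for labels and title
    x = linear_scale(max(values), width - left - 10)
    height = top + len(values) * (bar_height + gap)
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">',
        f'<text x="{left}" y="18" font-size="14">{title}</text>',
    ]
    for i, (value, label) in enumerate(zip(values, labels)):
        y = top + i * (bar_height + gap)
        # Category label, right-aligned against the start of the bar.
        parts.append(
            f'<text x="{left - 6}" y="{y + 15}" text-anchor="end" '
            f'font-size="12">{label}</text>'
        )
        # Bar length encodes the value through the linear scale.
        parts.append(
            f'<rect x="{left}" y="{y}" width="{x(value):.1f}" '
            f'height="{bar_height}" fill="steelblue"/>'
        )
    parts.append("</svg>")
    return "\n".join(parts)


# Placeholder values only; these are not census counts.
svg = bar_chart_svg([10, 20, 40], ["0-4", "5-9", "10-14"],
                    "Placeholder counts by age group")
```

Writing the scale yourself makes each encoding decision explicit: where zero sits, how much room the labels get, and how long the longest bar is. Note that an SVG would still need to be exported to .png or .jpg before submission.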
Data Set: U.S. Population, 1900 vs. 2000
Every 10 years, the Census Bureau documents the demographic make-up of the United States, influencing everything from congressional districting to social services. This dataset contains a high-level summary of census data for two years a century apart: 1900 and 2000. The data is a CSV (comma-separated values) file that describes the U.S. population in terms of year, reported sex (1: male, 2: female), age group (binned into 5-year segments from 0-4 years old up to 90+ years old), and the total count of people per group. With 19 age groups and 2 sexes, there are 38 data points per year, for a total of 76 data points.
Dataset: CSV. Source: U.S. Census Bureau via IPUMS.
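Before designing anything, it can help to confirm that the file has the shape described above. The sketch below is one possible check, under stated assumptions: the column names (`year`, `sex`, `age`, `people`) are hypothetical, so inspect the header of the real file first, and the CSV text is generated in place with a placeholder count of 1000 per row instead of reading the actual dataset.

```python
import csv
import io

# Hypothetical column names; check the header row of the real file.
COLUMNS = ["year", "sex", "age", "people"]


def build_placeholder_csv():
    """Return CSV text shaped like the dataset, with made-up counts."""
    lines = [",".join(COLUMNS)]
    for year in (1900, 2000):
        for sex in (1, 2):                    # 1: male, 2: female
            for age in range(0, 95, 5):       # bins start at 0, 5, ..., 90 (90 = 90+)
                lines.append(f"{year},{sex},{age},1000")  # placeholder count
    return "\n".join(lines)


def load_rows(text):
    """Parse CSV text into a list of dicts with integer values."""
    reader = csv.DictReader(io.StringIO(text))
    return [{key: int(value) for key, value in row.items()} for row in reader]


rows = load_rows(build_placeholder_csv())

# Count rows per year: 19 age bins x 2 sexes = 38 per year.
per_year = {}
for row in rows:
    per_year[row["year"]] = per_year.get(row["year"], 0) + 1

print(len(rows), per_year)  # prints: 76 {1900: 38, 2000: 38}
```

With the real file, you would pass its contents to `load_rows` (after adjusting the column names) and expect the same counts.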
- Start by choosing a question you'd like a visualization to answer.
- Design a static visualization (i.e., a single image) that you believe effectively answers that question, and use the question as the title of your graphic.
- Provide a short write-up (no more than 4 paragraphs) describing your design.
While you must use the data set given, you are free to transform the data as you see fit. Such transforms may include (but are not limited to) log transformation, computing percentages or averages, grouping elements into new categories, or removing unnecessary variables or records. You are also free to incorporate external data as you see fit. Your chart image should be interpretable without recourse to your short write-up. Do not forget to include a title, axis labels, or legends as needed!
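The transforms mentioned above can be sketched in a few lines of plain Python. This is only an illustration under assumptions: the records are made-up `(year, sex, age, people)` tuples rather than census counts, and the three life-stage categories in `life_stage` are an arbitrary grouping chosen for the example, not one the assignment prescribes.

```python
import math

# Made-up records in (year, sex, age, people) form; not census values.
records = [
    (2000, 1, 0, 400), (2000, 2, 0, 600),
    (2000, 1, 5, 500), (2000, 2, 5, 500),
    (2000, 1, 65, 200), (2000, 2, 70, 300),
]


def share_of_year(records):
    """Replace each count with its percentage of that year's total."""
    totals = {}
    for year, _sex, _age, people in records:
        totals[year] = totals.get(year, 0) + people
    return [
        (year, sex, age, 100.0 * people / totals[year])
        for year, sex, age, people in records
    ]


def life_stage(age):
    """Map the start of a 5-year bin to a broader, hypothetical category."""
    if age < 20:
        return "0-19"
    if age < 65:
        return "20-64"
    return "65+"


def group_by_stage(records):
    """Sum counts per (year, life stage), collapsing sex and 5-year bins."""
    grouped = {}
    for year, _sex, age, people in records:
        key = (year, life_stage(age))
        grouped[key] = grouped.get(key, 0) + people
    return grouped


def log_counts(records):
    """Apply a base-10 log transform to the counts (counts must be > 0)."""
    return [(year, sex, age, math.log10(people))
            for year, sex, age, people in records]


shares = share_of_year(records)
stages = group_by_stage(records)
```

Percentages make the two years comparable despite very different total populations, while grouping trades detail for a simpler message. Either choice hides something, which is worth noting in your write-up.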
As different visualizations can emphasize different aspects of a data set, you should document what aspects of the data you are attempting to most effectively communicate. In short, what story are you trying to tell? Just as important, also note which aspects of the data might be obscured or down-played due to your visualization design.
In your write-up, you should provide a rigorous rationale for your design decisions. Document the visual encodings you used and why they are appropriate for the data and your specific question. These decisions include the choice of visualization type, size, color, scale, and other visual elements, as well as the use of sorting or other data transformations. How do these decisions facilitate effective communication?
The assignment score is out of a maximum of 10 points. Historically, the median score on this assignment has been 8.5. We will determine scores by judging both the soundness of your design and the quality of the write-up. We will also look for consideration of audience, message and intended task.
We will use the following rubric to grade your assignment. Note that rubric cells may not map exactly to specific point scores.
We will reward entries that go above and beyond the assignment requirements to produce effective graphics. Examples may include outstanding visual design, meaningful incorporation of external data to reveal important trends, demonstrating exceptional creativity, or effective annotations or other narrative devices.
Submission Details
This is an individual assignment. You may not work in groups. Your completed assignment is due on Wednesday 2/12, by noon.
Submit your assignment using this form. The form expects your visualization to be an image (either a .png or .jpg). Please make sure your image is sized for a reasonable viewing experience: readers should not have to zoom or scroll in order to effectively view your submission!
Resubmissions. Resubmissions will be regraded by teaching staff, and you may earn back up to 50% of the points lost in the original submission. To resubmit this assignment, please use this form and follow the same submission process described above. Include a short, one-paragraph description summarizing the changes from the initial submission. Resubmissions without this summary will not be regraded. Resubmissions are due by 11:59pm on Saturday, 2/29. Slack days may not be applied to extend the resubmission deadline. The teaching staff will begin regrading assignments only once the Final Project phase begins, so please be patient.