Project Proposal

Team Project Objectives

  • Select a publicly available dataset from trustworthy primary or secondary data sources (see Stack Overflow Survey dataset in H3)
    • Examples of trustworthy data sources are government sources (e.g., https://data.gov/), organizations specialized in polling and survey research, etc.
    • Avoid sites such as Kaggle, where some datasets are of questionable quality
  • Describe the selected dataset using the Aether Data Documentation Template (released August 2022), available at https://www.microsoft.com/en-us/research/project/datasheets-for-datasets/
  • Consider what can be learned from the data and formulate questions that will guide your analysis
    • Design the problem-solving steps that answer the questions
    • Implement and test your design using Python, test-driven development, version control, and project management (as described in the guidelines)
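
As a rough illustration of the test-driven step above, the sketch below pairs one small analysis function with a test for it. It uses only the Python standard library. The names `count_by_field` and `load_rows`, the sample data, and the column names `Country` and `Language` are hypothetical placeholders, not part of any required design or of a real dataset.

```python
import csv
import io
from collections import Counter


def count_by_field(rows, field):
    """Count how often each non-empty value of `field` appears in `rows`."""
    counts = Counter()
    for row in rows:
        value = (row.get(field) or "").strip()
        if value:  # skip missing or blank answers
            counts[value] += 1
    return counts


# Hypothetical sample standing in for a real survey file.
SAMPLE_CSV = """Country,Language
Canada,Python
Canada,Rust
Brazil,Python
Brazil,
"""


def load_rows(text):
    """Parse CSV text into a list of dictionaries, one per data row."""
    return list(csv.DictReader(io.StringIO(text)))


def test_count_by_field_ignores_blank_answers():
    # In test-driven development this test is written first,
    # then count_by_field is implemented until the test passes.
    rows = load_rows(SAMPLE_CSV)
    assert count_by_field(rows, "Language") == Counter({"Python": 2, "Rust": 1})


if __name__ == "__main__":
    test_count_by_field_ignores_blank_answers()
    print("all tests passed")
```

A test runner such as pytest would discover the `test_` function on its own; the `__main__` guard is only there so the file can also be run directly.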

Proposal Artifact

The project proposal is written as a Markdown page in a file named PROPOSAL.md, located in the team's remote GitHub repository.

Organization of Project Proposal

Use the template provided below. It has headings and prompts for the proposal sections.

  • Address clearly and precisely all the prompts in the template.
  • Both team members contribute to writing the proposal.
  • Keep the prompts in the proposal.
## {Project Proposal}

**Contributors**: {Team members' names}
**Date**: {Date proposal was created}

### Data Source
List the source of your dataset: name and link. 

### Data Documentation
Both team members contribute to this section.
- Briefly describe the dataset the project will investigate. See the Aether Data Documentation Template.
- Focus on overview, data collection procedures, representativeness, data quality, privacy, and distribution and access.
- Divide work equally and include authorship information for content in this section that each student wrote.

### Questions
Both team members contribute to this section.
- Write FOUR questions (two for each team member) that: 
  - Guide your project work.
  - Give interesting insights into the dataset.
- For each question: 
  - Indicate the dataset field(s) involved in your question.
  - Describe how the data in the selected fields relate to each other.
  - Describe what other characteristics of the data would provide additional insights into the question.

Proposal Writing and Submission

  • Team members prepare the proposal document collaboratively.
    • The document MUST clearly state who has written what content in the proposal.
  • The final version of the document is in the docs directory of the remote repo.
    • Team members are encouraged to practice the feature-branch workflow when working on the proposal together.
    • The version control history serves as evidence of team members’ collaboration.
  • Each team member creates a PDF version of the project proposal and uploads it to Canvas.

Evaluation

  • Although the proposal is not graded, failing to submit it on time will cost you 10% of the project points.
  • The content of the proposal will be used in the project report.