Group Project Specification
Social Web Analytics
1 Aim
• 2 Method
• 3 Group Size and Organisation
• 4 Due date and Submission
• 5 Report Format
• 6 Marks
• 7 Declaration
• 8 Project Description
1 Aim
The Group Project provides us with a chance to analyse the Social Web using knowledge obtained from this unit with assistance from a computer based statistical package. For this project, we will focus on identifying a chosen companies Twitter image.
2 Method
To complete this project:
1. Read through this specification.
2. Form a group and register your group using the 300958 dashboard
3. Choose a company that is active on Twitter, check that it is not already on the list of company names (login required). Then submit the Twitter handle of the company using the “Submit Twitter handle” section of the 300958 dashboard. Once submitted, your chosen Twitter handle will appear on the the list of company names with a time stamp. Note that a given company cannot be allocated to more than one group. If duplicate companies are found on the list, the group with the later time stamp will be asked to find a new company.
4. Complete the data analysis required by the specification.
5. Write up your analysis using your favourite word processing/typesetting program, making sure that all of the working is shown and that is it presented well.
6. Include the student declaration text on the front page of your report. Please make sure that the names and student numbers of each group member are clearly displayed on the front page.
7. Submit the report as a PDF by the due date using the “Upload Assignment” section of the 300958 dashboard.
3 Group Size and Organisation
Students in groups of size 4 are to work together to complete this project. One project report is to be submitted per group.
The group must be formed by signing-up to a group within the Project section of 300958 in vUWS. It is not compulsory for all students in a group to be from the same lab class, but it is generally a good idea.
Groups must be formed by the end of week 9. Once the group is formed, A ‘manager’ should be nominated from within the group to be responsible for submitting the report.
4 Due date and Submission
The project report is due in by 11:59 p.m. on the Friday of week 13. The report must be submitted as a PDF file using the assignment submission facilities in the Project section of 300958 in vUWS. Only one student from each group (the manager) needs to submit the assignment.
5 Report Format
Once the required analysis is performed by the group, the members of the group are to write up the analysis as a report. Remember that the assessor will only see the groups report and will be marking the group’s analysis based on your report. Therefore the report should contain a clear and concise description of the procedures carried out, the analysis of results and any conclusions reached from the analysis.
The required analysis in this specification covers the material presented in lectures and labs. Students should use the computer software R to carry out the required analysis and then present the results from the analysis in the report.
6 Marks
This project is worth 30% of your final grade, and so the project will be marked out of 30. The project consists of four investigations and will be marked using the following criteria:
Marks Criteria Satisfied
0-5 One of the project parts have been completed correctly.
6-10 Two of the project parts have been completed correctly.
11-15 Three of the project parts have been completed correctly.
16-20 All of the project parts have been completed correctly.
21-25 The required work has been completed correctly and the company questions have been answered based on the results.
There are also five marks allocated to presentation (based on the report formatting, style, grammar and mathematical notation). If the report looks like something that would be submitted to an employer, then the full five marks will be awarded.
If a report is submitted late, the maximum mark it can achieve will be reduced by 10% (3 marks) per day. E.g., if a report is submitted five days late, it can receive at most 15 marks.
7 Declaration
The following declaration must be included in a clearly visible and readable place on the first page of the report.
________________________________________
By including this statement, we the authors of this work, verify that:
• We hold a copy of this assignment that we can produce if the original is lost or damaged.
• We hereby certify that no part of this assignment/product has been copied from any other student’s work or from any other source except where due acknowledgement is made in the assignment.
• No part of this assignment/product has been written/produced for us by another person except where such collaboration has been authorised by the subject lecturer/tutor concerned.
• We are aware that this work may be reproduced and submitted to plagiarism detection software programs for the purpose of detecting possible plagiarism (which may retain a copy on its database for future plagiarism checking).
• We hereby certify that we have read and understand what the School of Computing and Mathematics defines as minor and substantial breaches of misconduct as outlined in the learning guide for this unit.
Note: An examiner or lecturer/tutor has the right not to mark this project report if the above declaration has not been added to the cover of the report.
________________________________________
8 Project Description
A company is investigating its public image and has approached your team to identify what the public associates with the company name. The company wants the four pieces of analysis to be performed.
8.1 Analysis of Twitter language
In this section, we want to examine the language used in tweets.
1. Download the random sample of tweets randomSample2016.csv, and load using randomSample = read.csv(“randomSample2016.csv”)
2. Use the tm library to construct a document-term matrix of term frequencies.
3. Sum the columns to obtain a vector of term frequencies summed over all tweets.
4. Compute the proportion of each term in the random sample, from the vector of term frequencies.
5. List the top 10 words and their proportion.
What do these words tell us about the random sample?