Recent Changes

Saturday, December 20

  1. page 4_Work edited ... Can we use social data collected from Facebook and LinkedIn to find perfect employee? Probabl…
    ...
    Can we use social data collected from Facebook and LinkedIn to find the perfect employee?
    Probably not the absolutely perfect employee, but the information from those sites (as well as possibly other social media sites) can definitely help you make informed decisions when it comes to hiring.
    Social media is just one facet of how people choose to represent themselves online. It is by no means an indicator of their professional personality. For example, someone's Instagram could have no pictures relating to their field of expertise, and another person could have all their posts related to their profession. This difference in how people choose to present themselves in a casual social setting designed to be shared with friends, not recruiters, does not reflect how they will perform as an employee. I believe social media can be used as a precautionary filter, but it does not showcase enough (or any) of that person's skill level.
    Do you see oDesk as a platform that will change real-world jobs, not only online jobs?
    Yes. The methodology that oDesk provides for choosing freelancers closely mirrors a shift the world is undergoing today. Many job applications now go through LinkedIn, which provides testimonials and proficiencies very similar to those on an oDesk profile.
    (view changes)
    11:02 pm
  2. page 6_Learning edited ... Recently, new forms of education are growing rapidly. Especially, MOOCs (Massively Open Online…
    ...
    Recently, new forms of education have been growing rapidly. In particular, MOOCs (Massive Open Online Courses), free online courses provided by well-known universities, are spreading worldwide. For example, Coursera, the largest MOOC provider, has more than 10 million users and offers more than 800 courses.
    {MOOCs.png} Example of MOOCs (Udacity)
    A MOOC platform that UC Berkeley has recently become more involved in is edX: https://www.edx.org/school/uc-berkeleyx.
    These online classes allow anyone, whether or not they are a registered student at another university, to sign up and take free courses. There are additional support systems, such as bulletin boards and online "office hours", in which students can interact with each other and the staff. There are graded homework assignments and exams with feedback. In addition to making education more accessible to the general public, edX has gained popularity through the schools that use it, which encourage their students to try out new courses online without having to commit to them.

    Besides MOOCs, there are other digital solutions for education, such as digital textbooks and software for learning languages.
    {digital_textbook.jpg} Example of Digital Textbook
    (view changes)
    10:58 pm

Sunday, December 7

  1. page HW5_Final edited ... Assigned: Fri Nov 7, 2014 Due: Sat Nov 22, 2014 at noon Subject: 16 pages reading for last …
    ...
    Assigned: Fri Nov 7, 2014
    Due: Sat Nov 22, 2014 at noon
    Subject: 16 pages reading for last class (Nov 18) and final homework (Nov 25)
    Social Data Revolutionaries!
    Please email me any questions you have, and / or points you would like me to consider incorporating in the last class, by Sunday evening, Nov 16, 2014.
    My goal for our last class on Tuesday, Nov 18, 2014, is to bring out the unifying dimensions across the many aspects of social data we encountered, as well as to come to an overall understanding of how the different examples our guest speakers gave fit together in the course.
    To prime you for the final class…
    I know you’ve already done a lot of work for class: made a video, written an essay, looked around oDesk, and worked on the course wiki. So I kept the final homework relatively small, with a good ratio between thinking and learning vs your writing.
    Please email me if you have not received the 16-page book proposal that I just finished. It is currently under review and I ask you to please not distribute it outside class. Your task is, for each of the eight chapters and the conclusion, to come up with
    • One crisp sentence that summarizes the main idea of the chapter
    • What do you disagree with and/or what do you think is missing (e.g., can you give better examples)?
    • How can the argument be sharpened?
    Please submit your responses using the Google Form at http://bit.ly/ischool2014hw5 by Tuesday, Nov 25, 2014.
    FYI, the first half of the last class will focus on Data Ownership. Pete Warden, who busted Apple for logging geolocation without telling its users, then also crawled 220 Million public Facebook profiles, and recently sold his company JetPac to Google, will share some of his stories on the social data revolution, http://bit.ly/ischool2014warden.
    And to give you all a chance to talk to him and each other, the Social Data Lab will get food catered by Julie’s Cafe for everybody right after class, just outside 202. Hope you can all stay and mingle and eat! And if you know someone who might be interested in taking the course next year, please invite them to join us on Tuesday.
    Looking forward to seeing you again for our final class!
    Andreas
    PS: Several of you asked me about a discount for the Pebble watches. Susan made it easy for us: Just order it starting at https://getpebble.com/promotions/berkeleybuds. (I think mine came to around USD 60.) This is of course not required for class and whether you get one or not has no impact on your grade.

    Direct any questions to Raven Jiang (raven at cs dot stanford dot edu)
    (view changes)
    3:09 pm

Monday, December 1

  1. page 8_Ownership edited ... Techopedia: Data Exhaust http://www.techopedia.com/definition/30319/data-exhaust CNBC: Data…
    ...
    Techopedia: Data Exhaust
    http://www.techopedia.com/definition/30319/data-exhaust
    CNBC: Data mining is now used to set insurance rates
    http://www.cnbc.com/id/101586404
    Forbes: 'God View': Uber Allegedly Stalked Users
    http://www.forbes.com/sites/kashmirhill/2014/10/03/god-view-uber-allegedly-stalked-users-for-party-goers-viewing-pleasure/

    (view changes)
    5:12 am
  2. page 8_Ownership edited ... So, what does this tell us about the data? It shows the incredible amount of data that is avai…
    ...
    So, what does this tell us about the data? It shows the incredible amount of data that is available if one actively searches for it. Andreas’ data may not seem important to you, because he is only a single individual. But as Pete Warden pointed out, while a single person’s data might not be very significant, individuals’ data tied together in the aggregate can be extremely insightful.
    Given all the things that we can learn about ourselves and, probably of more concern, the things that others can learn about us through data exhaust, the question always comes back to: who owns this data? As has been underlined throughout the class, data exhaust is important for products and services to best cater to you, but its use must be transparent in order to create a more balanced, symmetric relationship that shifts some of the power toward the consumer, which is very important in the digital economy.
    How Companies Use Personal Data
    Insurance companies are no longer allowed to use data about pre-existing conditions for their health insurance policies, but they are looking to use new data sources. According to Robert Hunter, the Consumer Federation of America's director of insurance, insurance companies use a "data mining tool that lets insurance companies figure out which groups of customers are more likely to accept a price increase and which are more likely to shop around for a new policy."
    Insurance companies have traditionally used complicated equations to calculate risk, using the information that their clients willingly offer. Now, they may have alternate methods of getting more data to help them calculate risk. Consumers have more data made available to them, but they are also generating more data that can be mined by companies, which may not always be in their favor.
    Other companies, especially web retailers, track online clicks to learn more about their consumers. Amazon is known for its personalized recommendations, which it generates by analyzing which products its consumers look at. Other companies have since adopted similar methods to provide recommendations for their consumers.
    Facebook also tracks clicks very carefully, on its own website and on thousands of other websites as well. Whenever you are logged into Facebook and visit an external website that has the Facebook "Like" button embedded, Facebook knows you have visited that website. It will often then serve you an ad from that website, or even for the product you were looking at, shortly after.
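    Why an embedded button reveals a visit can be sketched in a few lines. This is a simplified simulation under stated assumptions, not Facebook's actual implementation: the class WidgetProvider, the function load_page, the cookie ID, and the example URLs are all hypothetical. It only illustrates that a request for an embedded widget carries both the visitor's cookie and the address of the embedding page.

```python
from collections import defaultdict


class WidgetProvider:
    """Hypothetical third party whose button is embedded on many sites."""

    def __init__(self):
        # Maps a cookie ID to the list of pages on which the widget was loaded.
        self.visits = defaultdict(list)

    def serve_widget(self, cookie_id, referer):
        # The browser attaches the provider's cookie and the embedding page's
        # address to the widget request, so the provider can log both.
        if cookie_id is not None:
            self.visits[cookie_id].append(referer)
        return "<button>Like</button>"


def load_page(provider, page_url, cookie_id):
    """Simulate a browser rendering a page that embeds the widget."""
    return provider.serve_widget(cookie_id=cookie_id, referer=page_url)


provider = WidgetProvider()
load_page(provider, "https://shoes.example/red-sneakers", cookie_id="user-42")
load_page(provider, "https://news.example/politics", cookie_id="user-42")
load_page(provider, "https://shoes.example/boots", cookie_id=None)  # logged out

# Browsing history the provider can reconstruct for the logged-in visitor.
history = provider.visits["user-42"]
```

    In this sketch the visitor never clicks the button; loading the page is enough for the visit to be recorded.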
    Recently, Uber has been in the media for a variety of ethical violations. One of them involves a "god view" of the app, which shows passengers' personally identifying information as well as live tracking of their Uber rides. Passengers have a reasonable expectation of how their information will be handled when they agree to share it and use a service like Uber, and Uber failed to uphold this standard.

    Mitigating harm from data
    According to Brad Rubenstein, your presence in a data set is a danger to you whether you know it or not. Your information can always be used to exploit you in ways that you might not agree with. Once the information is out there, it’s impossible to get it back. Because of this, companies and individuals collecting data have a responsibility to think of the ways in which the data they are collecting could be used to harm individuals. They also have a
    (view changes)
    5:08 am

Sunday, November 30

  1. page 8_Ownership edited ... http://youtu.be/N_C00zQpcqw Transcript: https://www.dropbox.com/s/sgvikvugfxecp6y/weigend_i…
    ...
    http://youtu.be/N_C00zQpcqw
    Transcript:
    https://www.dropbox.com/s/sgvikvugfxecp6y/weigend_ischool2014_8.docx?dl=0
    8_Data Ownership and the Future of Data
    Timeline Nov 18, 2014
    (view changes)
    9:22 am

Wednesday, November 26

  1. page 8_Ownership edited ... 7:00 DINNER with the students responsible for this wiki page. Introduction (Linda) data Da…
    ...
    7:00 DINNER with the students responsible for this wiki page.
    Introduction (Linda)
    Data ownership (Linda)
    Data stewardship (William)
    Data governance
    Brief History (William)
    Data Exhaust (Michael)
    Definition
    Significance
    Example: Archive.org
    How Companies Use Personal Data (Sophie)
    Insurance companies
    Online tracking (Amazon/Facebook/etc.) -> Potential tracking in real life
    “Leaked” data not
    ...
    interest (Uber)
    Mitigating Harm From Data (Holly)
    Rubenstein’s two types of data ownership
    Dangers for consumer
    Problems and questions to consider
    Customization & Privacy Settings According to Preference (Noelle)
    Potential for being more rewarding through personalization
    Methods
    Transparency in what consequences certain settings have
    Discussion Questions
    Data Ownership and the Future of Data
    Introduction
    (view changes)
    2:21 pm
  2. page 8_Ownership edited ... 6:30 END 7:00 DINNER with the students responsible for this wiki page. Introduction (Willia…
    ...
    6:30 END
    7:00 DINNER with the students responsible for this wiki page.
    Introduction (Linda)
    data ownership
    data stewardship
    data governance
    Brief History
    Data Exhaust (Michael)
    Definition
    Significance
    ...
    Methods
    Transparency in what consequences certain settings have
    Data Ownership and the Future of Data
    Introduction - Linda
    Humans have used devices to aid the computation and storage of data for thousands of years (Wikipedia, Information Technology). Sumer, one of the ancient civilizations and historical regions of southern Mesopotamia (Wikipedia, Sumer), for example, used clay tablets to document land ownership and access to resources such as water. Such data was extremely valuable, as the owners of these records controlled the distribution of resources. Thus, as early as 3000 BC, data ownership was directly associated with power. Today, it is far easier to collect, store, and process data than it was in ancient Sumer, yet the topic of data ownership is more important than ever. The aim of this document is to provide an overview of the concept of data ownership and to discuss the importance of data and privacy in today’s society.
    What is data exhaust? - Julian
    Dictionary.com defines data exhaust as unstructured information or data that is a byproduct of the online activities of internet users.
    Everything you do online, or with almost any smart device you interact with, creates exhaust. While these little bits of data seem insignificant on the surface, when aggregated and structured they can give the people who gather them a detailed portrait not just of what everyone is doing online but also of what you personally are doing. This data ranges from your IP address, which can reveal your location and identify you, to bounced emails, to the background information stored on your phone (i.e. your metadata). To give a better understanding of what your metadata looks like, the German newspaper Die Zeit gathered six months of cell phone metadata and compiled that information into a visual representation. A large part of the exhaust you generate comes from your browser: the settings you use, the cookies you have, and the data you broadcast. The EFF offers a site, Panopticlick, where you can see how incognito (or not so incognito) you are based on some of the data you broadcast. So naturally two questions arise:
    If you are generating all of this data which seems like it can be worth a lot of money, who owns it?
    How do you control what data is being collected about you and analyzed?
    We will address the first point in the next section. As to the second point, aside from giving up on technology and becoming Amish, there is little you can do to control what data is being collected. However, while you can't stop people from collecting your data, you can manage the exhaust you generate from using smart devices and the internet. The first step in this process is to be aware of what data you are revealing about yourself when you are online (including metadata) and whether there is anything about that data that would bother you if people analyzed it, especially maliciously. While most of the time people who analyze metadata are only trying to improve their service and thereby your experience, there are also people out there who will be able to derive surprising amounts of information about you and your life from the data you give off. Your social network generates a lot of really useful data, but it is still up to you to manage that data if there is something you would rather keep private. One way to do this while remaining on a social network is to intersperse deceptive information with truthful information, so that the data people are collecting about you becomes less reliable. Of course, if everyone did this, analysis of your social network's data would become less valuable, so you should weigh what you are gaining against what you are giving up before corrupting your online social network data. Another way of managing your online presence and IP address is to use a VPN or to route your traffic through the Tor network. While this won't protect information you post publicly, it will hide some of the exhaust you create while browsing. A good list of ideas for protecting your privacy can be found on eff.org's website.
    While we have just spent a long time explaining ways to protect your digital exhaust, it is important to remember that not all exhaust is bad and not all people analyzing exhaust are bad. Most of the time the people who analyze your exhaust are doing it to improve your experience, but since the exhaust is created by you, you should also be aware of what you are creating and how it is being, or can be, used.
    {http://www.dynamicscrmpros.com/wp-content/uploads/2013/07/data-ownership.jpg}
    What is data ownership? - Linda
    According to techopedia.com, data ownership is defined as “the act of having legal rights and complete control over a single piece or set of data elements” (Techopedia, Data ownership). Being in control of data means having the ability to access, create, modify, share, sell, or remove data, as well as the right to give these privileges to trusted others (Loshin, 2001). Assigning data ownership is not always easy and, as described in Loshin (2001), there are multiple paradigms for defining the owner of data.
    Ownership paradigms
    From a business perspective, the owner can be a legal person such as an individual (e.g. a general practitioner) or an organization (e.g. an enterprise) that collects personal data from people. In such a scenario, one often speaks of data controllers. For example, a general practitioner is the controller of his patients’ data; an enterprise is the controller of client and employee data. Such data ownership comes with responsibilities. The controller, for example, is responsible for maintaining and delivering the data to its users as well as for ensuring that only authorized persons have access to the data and that necessary steps have been taken to manage data risks. Alternatively, it is also possible to see the creator of the data as the owner. For example, a geographic data consortium might collect geographical data and store it in a database. Similarly, one could argue that users of social networks are creators of data. Since the data shared on social networks is very personal, it is likely that users will claim ownership of this data. Another interesting paradigm is the model of global data ownership according to which data should be available to all without restrictions. This model is often used in scientific communities where the main goal is to share and increase common knowledge.
    ...
    data stewardship? - William
    A data steward
    ...
    data element.
    What is data governance? - William
    According to Wikipedia, data governance is a control that ensures that data entry by an operations team member or by an automated process meets precise standards, such as a business rule, a data definition, and data integrity constraints in the data model. The data governor uses data quality monitoring against production data in the golden source to communicate errors in data entry back to the operations team members, or to technology, for corrective action. Through data governance, organizations seek to exercise positive control over the processes and methods used by their data stewards and data custodians to handle data.
    As we discussed in class, data governance has become one of the most controversial topics with the rise of social media and other data platforms, such as electronic health records. The traditional governors of the data, in this case Facebook users and patients, no longer seem to have strict and complete governance over their data. However, as the topic is brought up in legislation more and more often, for example in the HITECH Act, better-defined data governance is under way.
    Data Exhaust
    In the digital age, humans create, share, and record more data in a single day than we did from the beginning of history to the year 2000. The important question therefore becomes how we manage all this data, and who owns it. With a growing fear of a surveillance state, issues centered on data ownership bring the individual’s data exhaust into focus.
    {http://www.1to1media.com/weblog/customer%20experience%20ecosystem.png}
    According to Techopedia, data exhaust is the “data generated as trails or information byproducts resulting from all digital or online activities.” The choices we make online contribute to stored data such as log files, cookies, temporary files, and records of other digital processes or transactions. These pieces of data are collected and can be used to personalize the user experience through targeted advertisements and unique recommendations. Data exhaust gives a more accurate picture of individuals’ preferences, likes, habits, etc. that companies can use to provide better services and products that people are more likely to consume.
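    How log files turn into a picture of someone's preferences can be sketched as follows. This is only an illustration under stated assumptions: the log format ("visitor timestamp path"), the visitor IDs, and the function build_profiles are hypothetical, and real systems are far more elaborate.

```python
from collections import Counter, defaultdict

# Hypothetical web server log: one "visitor_id timestamp path" entry per line.
LOG = """\
v1 2014-11-18T10:00 /shoes/red-sneakers
v1 2014-11-18T10:02 /shoes/boots
v2 2014-11-18T10:05 /books/privacy
v1 2014-11-18T10:07 /books/privacy
"""


def build_profiles(log_text):
    """Count, per visitor, how often each top-level category was viewed."""
    profiles = defaultdict(Counter)
    for line in log_text.splitlines():
        visitor, _timestamp, path = line.split()
        category = path.split("/")[1]  # "/shoes/boots" -> "shoes"
        profiles[visitor][category] += 1
    return profiles


profiles = build_profiles(LOG)

# The most viewed category is a crude guess at a visitor's main interest.
top_interest = profiles["v1"].most_common(1)[0][0]
```

    Each individual log line says very little; the profile only emerges once the lines are grouped by visitor and counted.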
    Consequently, the question arises of how individuals and companies will manage this plethora of data. With the rising amount of information, even what some people would consider irrelevant information can prove to be important and, in some cases, dangerous. In class, Pete Warden, the co-founder of Jetpac, explained a scenario where his company was collecting massive amounts of public pictures and using image processing techniques to find out what was happening in the photos. For example, through people’s Instagram pictures, they could find out where the gay bars in San Francisco are. However, in another case, Pete ran across pictures of gay people in Tehran, Iran, enjoying themselves at bars. In Iran, homosexuality is a crime punishable by law. This is an example of how some individuals may not be fully aware of their own digital exhaust, and of the repercussions that something as simple as posting a picture can have.
    Archive.org
    A great example of the massive amounts of stored information on the web is the Internet Archive, or archive.org. This is a non-profit digital library with the goal of providing free, universal access to knowledge. Brad Rubenstein joked during class that we should look into Professor Weigend’s website (weigend.com) through the archive. So that’s just what we did.
    {archiveorg.png}
    Archive.org gives us insight into how often the Professor edits the website, when the site started, and what information was formerly posted on it. This is a great tool for Andreas because if he loses data or a page that he wrote on his website, he can go back through the archive and find it by looking through his own edits, shown as snapshots. We can also see how his activity has changed over the years, with peaks in 2005 and 2008, when the most saves were made.
    So, what does this tell us about the data? It shows the incredible amount of data that is available if one actively searches for it. Andreas’ data may not seem important to you, because he is only a single individual. But as Pete Warden pointed out, while a single person’s data might not be very significant, individuals’ data tied together in the aggregate can be extremely insightful.
    Given all the things that we can learn about ourselves and, probably of more concern, the things that others can learn about us through data exhaust, the question always comes back to: who owns this data? As has been underlined throughout the class, data exhaust is important for products and services to best cater to you, but its use must be transparent in order to create a more balanced, symmetric relationship that shifts some of the power toward the consumer, which is very important in the digital economy.
    Mitigating harm from data (Holly)
    According to Brad Rubenstein, your presence in a data set is a danger to you whether you know it or not. Your information can always be used to exploit you in ways that you might not agree with. Once the information is out there, it’s impossible to get it back. Because of this, companies and individuals collecting data have a responsibility to think of the ways in which the data they are collecting could be used to harm individuals. They also have a
    Rubenstein’s two types of data ownership
    ...
    It is difficult to approach the problem of data collection because data collection is hard to stop at the source. In our current society, we cannot stop collecting data; it has become an automatic process and is an essential part of the way some companies operate. We can, however, educate people collecting data on how to mitigate harm. Journalists are starting to monitor and keep track of the outputs/results, which increases . The focus should shift from looking at the outcomes to looking at how the data is used and how people are manipulating it. For example, I want my medical information immediately available to the hospital and doctors treating me, but I don’t want insurance companies to get this information on me. There need to be limitations on what actions companies can take once they have that information.
    Personalization of Privacy
    by Noelle Reyes
    An individual’s notion of privacy and their desire to protect or disperse it ultimately depend on their preference. Factors that affect this include knowledge of the technology and familiarity with the means of controlling how their data is being used. Brad Rubenstein describes how consumers must be responsible for actively monitoring how companies use their data, rather than just telling companies that they cannot have it.
    Current methods are often limited to one’s own initiative to change default security settings on social media or search engines. For example, Google Chrome’s Incognito mode keeps browsing information from the user's session from being stored locally; yet the servers of the websites you visit still have access to your location and usage. How can this be modified to protect the identity of a user even a step further? Or, on the opposing side, why do users feel that features such as Incognito are even necessary?
    ...
    EU Justice: Who can collect and process personal data?
    http://ec.europa.eu/justice/data-protection/data-collection/index_en.htm
    Techopedia: Data Exhaust
    http://www.techopedia.com/definition/30319/data-exhaust

    (view changes)
    2:07 pm
  3. page 8_Ownership edited ... data governance Brief History Data Exhaust (Michael) (Julian) Definition Significance …
    ...
    data governance
    Brief History
    Data Exhaust (Julian)
    Definition
    Significance
    ...
    (view changes)
    2:00 pm
  4. file archiveorg.png uploaded
    1:50 pm
