Online Contributions

This module has been challenging for me, since I walked into the first class basically computer illiterate! Of all the online networking options we were given, I had previously used none. So I started slowly and set up my blog at the beginning of the semester, but it took me a while to get the hang of it. I also set up a Twitter account, but unfortunately I found it slightly confusing and most weeks forgot I had it, as none of the notifications came to my main email or my phone, so I did not actually use it in the end. For that reason I decided to stick to my blog.

Being a final-year student I was preoccupied with various things, but I attempted to combine work I had done in the past with interests of mine on my blog. I tried out different visuals such as images and YouTube clips, and a couple of times used a link that would take the reader to an article. There have not been any discussions or posts via StudyNet so I didn’t post anything on there, although looking back I could have posted updates there whenever I added a new post to my blog so that people could have a look. I did not have anyone else’s blog address so I didn’t comment on theirs; instead I focused on my own posts.

Overall I have definitely learnt a lot and feel slightly more confident in my computer skills. I would love to try to continue my blog, but I think I would make it less about history and more on the creative side. As for Twitter, I cannot see myself getting used to it, but maybe if I dedicated some more time to it I would become comfortable with it. I do believe this module has helped me understand how history is created online and the importance of digitised history.

What is datamining, and does it encourage the creation of a specific kind of history?


Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases. It can be used in many different areas, such as history and business. For those who, like me, cannot grasp the computer technology side of things, here is an easy way to see data mining in action in everyday life, in particular in business. One Midwest grocery chain used the data mining capacity of Oracle software to analyse local buying patterns. They discovered that when men bought diapers on Thursdays and Saturdays, they also tended to buy beer. Further analysis showed that these shoppers typically did their weekly grocery shopping on Saturdays. On Thursdays, however, they only bought a few items. The retailer concluded that they purchased the beer to have it available for the upcoming weekend. The grocery chain could use this newly discovered information in various ways to increase revenue. For example, they could move the beer display closer to the diaper display, and they could make sure beer and diapers were sold at full price on Thursdays.[1]

Data mining consists of five key elements: extract, transform and load transaction data onto the data warehouse system; store and manage the data in a multidimensional database system; provide data access to business analysts and ICT professionals; analyse the data with application software; and present the data in a useful format. Data mining also shows the difference between methodologies: a ‘keyword’ search highlights a specific piece of data (a word), whereas the Semantic Web highlights information through the meaning implied within the results.[2]

This post will look at different methods such as Ngram and Topic Modelling and will evaluate how data mining is presented, as well as showing how historians interpret them.
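As a rough illustration of the beer and diapers example above, the sketch below counts how often two items turn up in the same basket. The baskets are invented for the purpose, and the function names `pair_counts` and `confidence` are my own; this is not how Oracle's software works, only a minimal picture of the idea.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets, invented purely for illustration.
baskets = [
    {"diapers", "beer", "crisps"},
    {"diapers", "beer"},
    {"diapers", "milk"},
    {"beer", "crisps"},
    {"milk", "bread"},
]

def pair_counts(baskets):
    """Count how many baskets contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sorting gives every pair a single, predictable order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts

def confidence(baskets, if_item, then_item):
    """Share of the baskets containing if_item that also contain then_item."""
    with_if = [b for b in baskets if if_item in b]
    if not with_if:
        return 0.0
    return sum(then_item in b for b in with_if) / len(with_if)

counts = pair_counts(baskets)
print(counts[("beer", "diapers")])             # baskets holding both items
print(confidence(baskets, "diapers", "beer"))  # how often diapers come with beer
```

A retailer would run the same kind of count over millions of real transactions and then look for the pairs that appear together far more often than chance would suggest.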

The purpose of data mining is to extract useful knowledge from the data, and to put that knowledge to beneficial use. Data mining consists of many tools which analyse information from different perspectives. It is used particularly to condense large databases which span different fields.[3] Data mining techniques can be used to filter many variables down to a vital few in order to build or improve predictive models. Specific examples are provided in four categories: classification, regression, clustering, and association. One classification technique is a tree. In a tree, the data mining tool begins with a pool of all cases and then gradually divides and subdivides them based on selected variables. The tool can continue branching until each subgroup contains very few (maybe as few as one) cases.[4] A limit is needed to prevent ‘overfitting’, where categories divide repeatedly until as little as one case is left in each; this is one of the dangers and disadvantages of the decision tree methodology. For analytical evaluation, the tree primarily highlights key variables.[5] Text mining’s use for historians or researchers is debatable. The algorithms would return more results when searching for a word, yet the results may not always be relevant, because a word may have more than one meaning, so some results will be unrelated to the question.
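The branching and the need for a limit can be sketched in a few lines. This is a deliberately simplified tree on a single numeric variable, with made-up cases; `build_tree`, `max_depth` and `min_size` are names I have chosen for illustration and do not come from any particular data mining product.

```python
from collections import Counter

def majority(labels):
    """Most common label in a group of cases."""
    return Counter(labels).most_common(1)[0][0]

def errors(labels):
    """Number of cases that disagree with the group's majority label."""
    return len(labels) - Counter(labels).most_common(1)[0][1]

def build_tree(cases, max_depth, min_size=2):
    """cases is a list of (value, label) pairs; returns a nested dict or a label."""
    labels = [label for _, label in cases]
    # Stop branching: group is pure, depth limit reached, or group too small.
    if len(set(labels)) == 1 or max_depth == 0 or len(cases) < min_size:
        return majority(labels)
    best = None
    for threshold in sorted({value for value, _ in cases})[1:]:
        left = [c for c in cases if c[0] < threshold]
        right = [c for c in cases if c[0] >= threshold]
        score = errors([l for _, l in left]) + errors([l for _, l in right])
        if best is None or score < best[0]:
            best = (score, threshold, left, right)
    if best is None:  # every case has the same value, so no split is possible
        return majority(labels)
    _, threshold, left, right = best
    return {
        "threshold": threshold,
        "below": build_tree(left, max_depth - 1, min_size),
        "at_or_above": build_tree(right, max_depth - 1, min_size),
    }

def count_leaves(tree):
    """Number of final subgroups in the tree."""
    if not isinstance(tree, dict):
        return 1
    return count_leaves(tree["below"]) + count_leaves(tree["at_or_above"])

# Hypothetical cases: a numeric variable paired with a category.
cases = [(1, "a"), (2, "a"), (3, "b"), (4, "a"), (5, "b"), (6, "b")]
shallow = build_tree(cases, max_depth=1)
deep = build_tree(cases, max_depth=10)
print(count_leaves(shallow), count_leaves(deep))
```

With the depth limit set to 1 the tree makes a single split, while the nearly unlimited tree keeps dividing until some subgroups hold only one case, which is the overfitting danger described above.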

 


Topic modelling selects topics from a body of data; Blei gives the example of Wikipedia. By then selecting a file you can connect it to the topics, which means ‘annotating’ the file, using algorithms to locate the different topics within it.[6] This is equivalent to the classification process in data mining, but uses a different method.

Stepwise regression is a type of multivariate regression in which variables are entered into the model one by one, while variables already in the model are tested for removal. It can be a good model to use when supposedly independent variables are correlated. Stepwise regression is one of the techniques that can help thin out the forest and find important predictive factors.[7] Despite the usefulness of this tool, humanities resources are equipped to function without it. This does not mean that they are equipped to deal with the ‘black box’ problem in data mining.[8] The ‘black box’ problem arises when some output data does not correspond with the input data and thus presents unsatisfactory results. In some ways it is similar to the Bayesian system in the clustering step, where the output data is on some occasions not relevant to the input data. This is particularly impractical for historians.

 


 

Cluster techniques detect groupings in the data. We can use this technique as a start on summarisation and segmentation of the data for further analysis. Two common methods for clustering are K-Means and hierarchical. K-Means iteratively moves from an initial set of cluster centres to a final set of centres. Each observation is assigned to the cluster with the nearest mean. Hierarchical clustering finds the pair of objects that most resemble each other, then iteratively adds objects until they are all in one cluster.[9] Historians have credited Inverse Document Frequency (IDF), while highlighting its experimental nature. However, when it is combined with Term Frequency (TF) it has extended from ‘text retrieval into methods for retrieval of other media, and into language processing techniques for other purposes.’[10] This proves to be a good way of correlating data to provide maximum results.
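The K-Means steps described above can be followed in a small sketch. It works on a single variable only, and the years and starting centres are invented; the function name `k_means` and its parameters are my own rather than any library's.

```python
def k_means(values, centres, max_rounds=100):
    """Simple one-dimensional K-Means; returns (final centres, clusters)."""
    centres = list(centres)
    clusters = []
    for _ in range(max_rounds):
        # Assign each observation to the cluster with the nearest centre.
        clusters = [[] for _ in centres]
        for value in values:
            nearest = min(range(len(centres)), key=lambda i: abs(value - centres[i]))
            clusters[nearest].append(value)
        # Move each centre to the mean of its cluster (keep it if the cluster is empty).
        new_centres = [
            sum(cluster) / len(cluster) if cluster else centre
            for cluster, centre in zip(clusters, centres)
        ]
        if new_centres == centres:
            break  # the centres have stopped moving
        centres = new_centres
    return centres, clusters

# Hypothetical publication years for a handful of documents.
years = [1914, 1916, 1918, 1939, 1941, 1945]
centres, clusters = k_means(years, centres=[1900, 1950])
print(centres)
print(clusters)
```

Starting from two rough guesses, the centres drift towards the middle of each group of years, so the documents fall into an earlier and a later cluster without anyone labelling them in advance.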

 


 

Association examines correlations between large numbers of quantitative variables by grouping the variables into factors. Each of the resulting factors can be interpreted by reviewing the meaning of the variables that were assigned to each factor. One benefit of association is that many variables can be summarized by just a few factors.[11]


Google N-Gram was created so that people can visualise the rise and fall of particular keywords across 5 million books and 500 years, and has so far covered about 4% of all the books ever published. From the rise and fall of the information displayed on the graph obvious correlations can be seen, but further interpretation is quite difficult.[12] Take a specific question, for example: when was the Cold War? Searching ‘Cold’ cross-referenced against ‘War’ would analyse each word as a separate entity, counting how many times each was mentioned rather than the significance the words have together. Aiden built a software tool called the n-grams viewer to chart the frequency of phrases across a corpus of 500 billion words. A ‘one-gram’ plots the frequency of a single word such as ‘feminism’ over time; a ‘two-gram’ shows the frequency of a contiguous phrase, such as ‘touch base’. Matthew Hurst also analysed it from a language perspective and compared words from different versions of English, American English and British English, to see how they had changed over the years, which was very little. He also compared the same word beginning with a capital letter and with a lower-case letter. However advanced this may seem, some humanities researchers in the traditional camp complain that their field can never be encapsulated by the frequency charts of words and phrases produced by the n-grams tool.[13] Other scholars have deep reservations about the digital humanities movement as a whole, especially if it will come at the expense of traditional approaches. “You can’t help but worry that this is going to sweep the deck of all money for humanities everywhere else,” says Anthony Grafton, a historian at Princeton.[14]
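The difference between counting ‘Cold’ and ‘War’ separately and counting the phrase ‘Cold War’ can be shown with a short sketch. The sample sentence is made up, and lower-casing every word is a simplification of mine (as noted above, comparisons on the real viewer distinguish capital letters); this is not Google's code.

```python
from collections import Counter

def ngrams(text, n):
    """Return the contiguous n-word sequences in text, lower-cased."""
    words = text.lower().split()
    return [" ".join(words[i:i + n]) for i in range(len(words) - n + 1)]

# A made-up sentence standing in for a scanned book.
sample = "the cold war was not a cold winter nor a hot war but a cold war"

one_grams = Counter(ngrams(sample, 1))
two_grams = Counter(ngrams(sample, 2))

print(one_grams["cold"], one_grams["war"])  # the two words counted separately
print(two_grams["cold war"])                # the contiguous phrase
```

The one-gram counts include ‘cold winter’ and ‘hot war’, which have nothing to do with the Cold War, whereas the two-gram count only picks up the phrase itself.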


 

It can be argued that digitising history has ushered in a new era in the way information is researched. Tim Hitchcock and some of his colleagues consider historians who discredit this type of history to be fairly old-fashioned people, isolated within their archives.[15]

Data mining methodologies have aided historians by categorising information and producing new and almost unthought-of perspectives on it. In relation to data mining, it could be argued that digital history is easier and more precise for research. On the other hand, books were the original form of information, and digital sources cannot replace the personal interaction and sentiment that one feels with the original document. Many historians have been excited about the new digital ways of research and Google N-Gram. N-Gram aims to release new data as soon as it can be compiled. In addition, unlike text-mining tools like COHA, Google Ngrams is multilingual. For the first time, historians working on Chinese, French, German, and Spanish sources can do what many historians have been doing for some time.[16] Overall, data mining is the analysis of large observational data sets to find unsuspected relationships and to summarise the data in novel ways that are both understandable and useful for the data owner. Data mining products are taking the industry by storm. The major database vendors have already taken steps to ensure that their platforms incorporate data mining techniques. In a historical sense there are still some improvements to be made, and not all historians may be won over by this new revolution, but it is the future and has definitely encouraged the creation of new and different forms of history.


[2] Fabio Ciravegna, Mark Greengrass, Tim Hitchcock, Sam Chapman, Jamie McLaughlin and Ravish Bhagdev, haystack, pp. 65-67

[3] Data Mining: What is Data Mining?, http://www.anderson.ucla.edu/faculty/jason.frand/teacher/technologies/palace/datamining.htm; consulted 10 April 2013

[5] Data mining for process improvement, http://www.crosstalkonline.org/storage/issue-archives/2011/201101/201101-Below.pdf; consulted 10 April 2013

[6] Topic models, http://videolectures.net/mlss09uk_blei_tm/; consulted 10 April 2013

[8] Fabio Ciravegna, Mark Greengrass, Tim Hitchcock, Sam Chapman, Jamie McLaughlin and Ravish Bhagdev, haystack, pp. 67-78

[10] Stephen Robertson, Understanding Inverse Document Frequency: On theoretical arguments for IDF, Journal of Documentation, vol. 60, no. 5, pp. 503–520

[11] Ibid

[12] Sapping Attention, http://sappingattention.blogspot.co.uk/; consulted 10 April 2013

[14] Ibid

[15] With Criminal Intent, http://criminalintent.org/; consulted 10 April 2013

Digital Histories Critique


I decided to explore the National Archives Cabinet Papers 1915-1982. The project followed the PRINCE2 methodology, an approach widely used within Government and also the private sector. The new resource was digitised from microfilms held at The National Archives that contain a large collection of volumes of documents. The digitisation was funded by JISC as part of the JISC Digitisation Programmes. JISC supports UK further and higher education and research by providing leadership in the use of ICT, and it receives funding from all the UK further and higher education funding councils. For someone who is fairly computer illiterate, such as myself, the website was pretty accessible and simple to use, which made it easy to navigate. The design of the website includes a number of accessibility features, such as the ability to adjust text sizes, access keys and transcripts for video content. The technical standards they work to are XHTML 1.0 Transitional, CSS 2.0, the UK Government Guidelines for websites and the Web Content Accessibility Guidelines 1.0, where they work to meet the minimum double-A checkpoints. The National Archives was committed to user-centred design, which means involving users at all stages of their design work in order to make the site accessible to all.

There are various ways to locate the cabinet papers, one of which is the search engine. The basic search is very simple: enter a keyword and a date, and it should take you to a list of papers that mention the word or were written on that date. There is also an advanced search, which takes not only a keyword and date but also phrases, a choice of full text or description, and the type of paper you want to find, e.g. memoranda, conclusions or notebooks, with an explanation of what each type of paper is. The search is provided via the free-text and metadata search facility, in combination with the browse sections of the website. The National Archives has used the IDOL 7.0 search engine supplied by Autonomy to power searching of the Cabinet Papers. This software provides the facility for both free-text and metadata-based searching. The service was tested to ensure that the free-text searching was effective when used in conjunction with OCR of the scanned PDF images of the Cabinet Papers. I think the advanced search includes all that is needed, as I found a limitation in the basic search. When a keyword such as ‘world war two’ is typed in, the results bring up all documents that contain any of the words ‘world’, ‘war’ or ‘two’, so the advanced search is typically needed, as you can type the query in as a phrase instead of a keyword and hopefully find what you are looking for. The other way to search for a cabinet paper is by browsing through a theme. There are three main themes with sub-titles within them, each of which takes you to a topic webpage with a description of that topic, and within that webpage there are more sub-titles that link to various cabinet papers on that topic or event. Although the process is a little long-winded, I think it is purposely broken down. This site is not just for university-level students but also for college students, who start from the age of sixteen, so the website needs to be usable by them.
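The limitation of the basic search can be pictured with a small sketch. The document summaries are invented and the two functions are hypothetical names of my own; this only imitates the behaviour I saw on the site and is not how the IDOL search engine is actually built.

```python
def keyword_search(documents, query):
    """Return titles of documents containing ANY word of the query."""
    words = query.lower().split()
    return [
        title for title, text in documents.items()
        if any(word in text.lower().split() for word in words)
    ]

def phrase_search(documents, query):
    """Return titles of documents containing the query as an exact phrase."""
    phrase = query.lower()
    return [title for title, text in documents.items() if phrase in text.lower()]

# Invented summaries, not real Cabinet Papers.
documents = {
    "paper A": "Planning for reconstruction after world war two",
    "paper B": "Report on world trade figures",
    "paper C": "Two memoranda on the cold war",
}

print(keyword_search(documents, "world war two"))
print(phrase_search(documents, "world war two"))
```

The keyword version returns all three papers because each contains at least one of the words, while the phrase version returns only the paper that actually mentions ‘world war two’.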

This website uses Optical Character Recognition (OCR) technology. The ease of access that OCR provides to the documents is of great value to research. However, this project has also discovered the limitations of OCR in its inability to recognise images, foreign text etc., and the manually transcribed metadata is useful here in ensuring access to the document. New processes have been developed for Quality Assurance (QA) and correction of OCR and for managing document releases within the collection. A new database had to be created which linked the references and information in Documents Online to the XML file references. The cabinet papers are all PDF files, scanned onto the site. I like this as it shows the original paper and provides the authenticity people are looking for. Not only does the site provide cabinet papers, but it also has a whole section, with sub-sections, about the Cabinet and how it works, how the records work and how they developed over the years, in quite some depth. What I have found about this website is that it covers a lot of ground but then also provides further reading for people who want to research more. This leads me on to the writing frame section of the website.

This section of the website makes it more interactive and is an encouraging way of getting students to do extra research or an independent project of their own. The Writing Frame is an interactive software tool designed to support the use of primary historical resources; there are already A level research projects that people can build on and add to, or they can start their own using the template. The last feature of this site is ‘maps in time’, which is a nice touch. It shows users the geographical locations of places mentioned in the cabinet papers that may be unfamiliar. The map is a great way to visualise where certain events took place: you simply click to zoom in on a particular country and click a time period at the top of the map. This brings up little red information dots which you can hover over to find out information about that time and place. It spans from 1900-200, so it exceeds the years of the cabinet papers provided, which makes it a good resource for exploring the places and the history further. There is an alternative, non-Flash version of the map in a PDF file, which is good for Apple Mac users who cannot view many things in Flash. This version is less interactive, as you are unable to click and navigate around the map yourself, which causes some restrictions. Because it is a snapshot of the map screen, the whole map can’t be shown at once, so the countries that are not visible are written about without a picture or image; this could confuse people who are unaware of where the country is.

The material featured on this website is subject to Crown copyright protection and licensed for use under the Open Government Licence unless otherwise indicated. You may use and re-use Crown copyright information from this website (other than the Royal Arms and departmental or agency logos) free of charge in any format or medium, under the terms and conditions of the Open Government Licence, provided it is reproduced accurately and not used in a misleading context.

Overall the website is quite well put together, a great way for students and researchers to get information with minimal limitations.

Lincoln Official Trailer

There have been many, many films based on things that happened in the past; I call these historic movies. There are the ones that create a fictional story around a past event or person, such as ‘Troy’ or ‘The Boy in the Striped Pyjamas’, which are highly entertaining, but this film seems to be different. It captures Lincoln as a person and is based only on him and his struggle to end slavery. This interests me as in college I wrote a 4000 word essay on Lincoln and the Emancipation Proclamation, so I look forward to seeing how accurate and entertaining this is. It has great reviews and won a few Oscars so it should be a good watch! 🙂

America vs England

I love studying history, I find it fascinating! My favourite area of study is modern American history as I find it most interesting! I am lucky enough to be visiting America for the first time in May, and since booking the trip I have been fantasising about what it will be like over there and what experiences I will have. Whilst for the last few weeks my mind has been occupied with thoughts of my trip, this week I started thinking about the differences between America and the place I was born and raised in… England. I decided to write a poem capturing the spirit of England…

Tea, scones and crumpets a delicacy over here,

You alright love?  One of the many ways to greet someone.

The patriotic nature in conjunction with the footy,

Pubs roaring as England come together to win,

London’s burning as rioters protest to be heard

Where’s our queen when we need her most.

Rainy days overshadow the brighter ones,

Towns and cities congested with people in a world

Of their own and minds on their problems,

Beaches and countryside are where you want to be,

Not lost in a jungle of multiculturalism and differences.

Is that a good or a bad thing

This is England, it is what it is.

Employability and Creative Writing, Mimi Thebo


^^^^^^^^ Click this link…..

I came across this article and had to share it! It was a great insight into the current attitudes and challenges faced by creative writing students or graduates! It got me thinking about what I feel I have gained from studying it and why I took it in the first place!

Creative Writing has been an extremely valuable subject in terms of the skills and knowledge it has equipped me with to take and use in my future field of work, which will hopefully be public relations. Not only does it give me the confidence to explore my creative mind, through writing a variety of different texts and genres for various different audiences, but it has helped me blossom in my writing.

Studying creative writing has given me not only the academic skills required in the workplace, such as being able to write cohesively and having a knowledge and understanding of written work, but also the social skills that are wanted from an employee in an establishment. Working in groups is a large part of the course; this has taught me how to work well with others as well as listen and give criticism and praise in the appropriate manner. Giving presentations and reading my work to the class gives me confidence and makes me aware of how to communicate with an audience, keep them engaged, and formally present my work. Working in PR is highly competitive and definitely involves having a creative mind, which I think I possess.

I have loved studying creative writing and would recommend it to anyone who wants to explore their creativity and further their knowledge of the craft. Reading various types of writing and being able to write in several different forms confidently enables me to adapt to different job roles and be aware of how to write, edit and communicate in certain establishments.

Overall I strongly believe creative writing has opened many doors for me and many fields of work I can explore in the future, providing me with all the skills mentioned along with giving me the confidence to articulate opinions, and improve products. I am extremely pleased that I studied the subject and look forward to the working world after university.

First Post

This is my first post. In my blog I aim to share my findings and views in regard to my History with Creative Writing degree. Each week I find something new and interesting relating to my course and I will post these things on here as a way to communicate my insights! 🙂 Hope you enjoy! x