http://www.papermachines.org/wiki/api.php?action=feedcontributions&user=CJR&feedformat=atomPaper Machines Wiki - User contributions [en]2024-03-29T09:09:34ZUser contributionsMediaWiki 1.22.6http://www.papermachines.org/wiki/page/User:PsheldUser:Psheld2016-05-24T19:56:34Z<p>CJR: Creating user page for new user.</p>
<hr />
<div>Engineer. Architect @hiproj. Managing Partner @eulerpartners. Author @attenzi & 'The Business of Influence'. Board Director @techUK. Director @netsocproj.</div>CJRhttp://www.papermachines.org/wiki/page/User_talk:PsheldUser talk:Psheld2016-05-24T19:56:34Z<p>CJR: Welcome!</p>
<hr />
<div>'''Welcome to ''Paper Machines Wiki''!'''<br />
We hope you will contribute much and well.<br />
You will probably want to read the [[https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents|help pages]].<br />
Again, welcome and have fun! [[User:CJR|CJR]] ([[User talk:CJR|talk]]) 12:56, 24 May 2016 (PDT)</div>CJRhttp://www.papermachines.org/wiki/page/User:CuchulainUser:Cuchulain2016-04-19T16:04:15Z<p>CJR: Creating user page for new user.</p>
<hr />
<div>Just wanted to experiment on my collection of business books.</div>CJRhttp://www.papermachines.org/wiki/page/User_talk:CuchulainUser talk:Cuchulain2016-04-19T16:04:15Z<p>CJR: Welcome!</p>
<hr />
<div>'''Welcome to ''Paper Machines Wiki''!'''<br />
We hope you will contribute much and well.<br />
You will probably want to read the [[https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents|help pages]].<br />
Again, welcome and have fun! [[User:CJR|CJR]] ([[User talk:CJR|talk]]) 09:04, 19 April 2016 (PDT)</div>CJRhttp://www.papermachines.org/wiki/page/User_talk:AazUser talk:Aaz2016-04-19T16:03:50Z<p>CJR: Welcome!</p>
<hr />
<div>'''Welcome to ''Paper Machines Wiki''!'''<br />
We hope you will contribute much and well.<br />
You will probably want to read the [[https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents|help pages]].<br />
Again, welcome and have fun! [[User:CJR|CJR]] ([[User talk:CJR|talk]]) 09:03, 19 April 2016 (PDT)</div>CJRhttp://www.papermachines.org/wiki/page/User:AazUser:Aaz2016-04-19T16:03:49Z<p>CJR: Creating user page for new user.</p>
<hr />
<div>I wish to use Paper Machines term frequency - inverse document frequency engine to automatically tag my Zotero collection.</div>CJRhttp://www.papermachines.org/wiki/page/User_talk:NelangstUser talk:Nelangst2015-04-07T17:29:27Z<p>CJR: Welcome!</p>
<hr />
<div>'''Welcome to ''Paper Machines Wiki''!'''<br />
We hope you will contribute much and well.<br />
You will probably want to read the [[https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents|help pages]].<br />
Again, welcome and have fun! [[User:CJR|CJR]] ([[User talk:CJR|talk]]) 10:29, 7 April 2015 (PDT)</div>CJRhttp://www.papermachines.org/wiki/page/User:NelangstUser:Nelangst2015-04-07T17:29:26Z<p>CJR: Creating user page for new user.</p>
<hr />
<div>I am working on a project to compare forestry discourses in Sweden and the US during different time periods.</div>CJRhttp://www.papermachines.org/wiki/page/User_talk:JoguldiUser talk:Joguldi2015-01-30T18:27:26Z<p>CJR: Welcome!</p>
<hr />
<div>'''Welcome to ''Paper Machines Wiki''!'''<br />
We hope you will contribute much and well.<br />
You will probably want to read the [[https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents|help pages]].<br />
Again, welcome and have fun! [[User:CJR|CJR]] ([[User talk:CJR|talk]]) 10:27, 30 January 2015 (PST)</div>CJRhttp://www.papermachines.org/wiki/page/User:JoguldiUser:Joguldi2015-01-30T18:27:25Z<p>CJR: Creating user page for new user.</p>
<hr />
<div>I am Jo Guldi, historian of infrastructure at Brown University.</div>CJRhttp://www.papermachines.org/wiki/page/User_talk:MkempnerUser talk:Mkempner2015-01-30T18:27:09Z<p>CJR: Welcome!</p>
<hr />
<div>'''Welcome to ''Paper Machines Wiki''!'''<br />
We hope you will contribute much and well.<br />
You will probably want to read the [[https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents|help pages]].<br />
Again, welcome and have fun! [[User:CJR|CJR]] ([[User talk:CJR|talk]]) 10:27, 30 January 2015 (PST)</div>CJRhttp://www.papermachines.org/wiki/page/User:MkempnerUser:Mkempner2015-01-30T18:27:08Z<p>CJR: Creating user page for new user.</p>
<hr />
<div>I like to read books, study history, and learn Portuguese.</div>CJRhttp://www.papermachines.org/wiki/page/User_talk:TestAccountUser talk:TestAccount2014-11-23T16:08:59Z<p>CJR: Welcome!</p>
<hr />
<div>'''Welcome to ''Paper Machines Wiki''!'''<br />
We hope you will contribute much and well.<br />
You will probably want to read the [[https://www.mediawiki.org/wiki/Special:MyLanguage/Help:Contents|help pages]].<br />
Again, welcome and have fun! [[User:CJR|CJR]] ([[User talk:CJR|talk]]) 08:08, 23 November 2014 (PST)</div>CJRhttp://www.papermachines.org/wiki/page/User:TestAccountUser:TestAccount2014-11-23T16:08:58Z<p>CJR: Creating user page for new user.</p>
<hr />
<div>This is a test account to confirm that account request works.</div>CJRhttp://www.papermachines.org/wiki/page/Main_PageMain Page2014-11-22T21:34:42Z<p>CJR: Protected "Main Page": High traffic page ([Edit=Allow only administrators] (indefinite) [Move=Allow only administrators] (indefinite))</p>
<hr />
<div>'''Welcome to the Paper Machines wiki.'''<br />
<br />
This is a space for user-contributed content related to Paper Machines.<br />
<br />
[[Getting Started]]</div>CJRhttp://www.papermachines.org/wiki/page/Main_PageMain Page2014-11-22T21:27:15Z<p>CJR: Reverted edits by AracelyV55 (talk) to last revision by CJR</p>
<hr />
<div>'''Welcome to the Paper Machines wiki.'''<br />
<br />
This is a space for user-contributed content related to Paper Machines.<br />
<br />
[[Getting Started]]</div>CJRhttp://www.papermachines.org/wiki/page/Working_with_JSTOR_Data_for_ResearchWorking with JSTOR Data for Research2014-11-18T19:44:01Z<p>CJR: </p>
<hr />
<div>[http://dfr.jstor.org/ JSTOR Data for Research] (DfR) provides basic bibliographic information and word counts for a large sample of the texts in the JSTOR collection. This data can be fed into machine learning algorithms such as the topic models included in [http://mallet.cs.umass.edu/topics.php MALLET]. At present, only the topic modeling functionality within Paper Machines supports DfR input.<br />
<br />
To obtain and analyze a DfR dataset:<br />
<br />
# You must first [http://dfr.jstor.org/accounts/register/ register] for a DfR account. You can then search for articles based on keywords, years of publication, specific journals, and so on. Note that if your query returns more than 1,000 results, you will receive a random sample of 1,000 documents.<br />
# Once the query has been refined to your liking, go to the Dataset Requests menu at the upper right and click "Submit New Request."<br />
# Check the "Citations" and "Word Counts" boxes, select CSV output format, and enter a short job title that describes your query. For example: [[File:Jstor_dfr_options.png]]<br />
# Once you click "Submit Job", you will be taken to a history of your submitted requests. You will be e-mailed once the dataset is complete.<br />
# Click "Download (#### docs)" in the Full Dataset column, and a zip file timestamped with the request time will be downloaded. This file (or several files from related queries, e.g., several searches divided up by decade) may then be incorporated into a model.<br />
# Paper Machines typically operates only on Zotero collections with full-text documents. To use DfR datasets, the easiest method is to create a new folder in Zotero, add one empty text note, and then choose "Extract Text" from the Paper Machines context menu. This in effect "tricks" the software into thinking you have a suitable full-text collection for analysis.<br />
# Once the collection is extracted, you can create a topic model by opening the context menu and selecting Topic Modeling -> "By Time (With JSTOR DFR)." If you select multiple zip files, they will be merged and duplicates discarded before analysis begins.<br />
<br />
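The merge-and-deduplicate step mentioned above can be pictured with a small Python sketch. This is not the code Paper Machines runs. It assumes, purely for illustration, that two archive members with the same name describe the same document, and the function names <code>merge_zip_members</code> and <code>write_merged</code> are hypothetical.

```python
import zipfile


def merge_zip_members(zip_paths):
    """Merge the members of several zip archives into one mapping.

    Assumption (not taken from Paper Machines): a member's file name
    identifies the document, so a repeated name is treated as a duplicate
    and only the first copy is kept.
    """
    merged = {}
    for zip_path in zip_paths:
        with zipfile.ZipFile(zip_path) as archive:
            for info in archive.infolist():
                if info.is_dir():
                    continue  # skip directory entries
                if info.filename not in merged:
                    merged[info.filename] = archive.read(info)
    return merged


def write_merged(merged, out_path):
    """Write the merged members to a new zip archive, sorted by name."""
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as out:
        for name in sorted(merged):
            out.writestr(name, merged[name])
```

Calling <code>merge_zip_members</code> with a list of downloaded zip paths returns one entry per distinct member name, which is one simple way to check how many unique documents several decade-by-decade queries actually produced.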
Be warned, the analysis may take a considerable amount of time before it begins to show progress (~15-30 minutes).</div>CJRhttp://www.papermachines.org/wiki/page/File:Jstor_dfr_options.pngFile:Jstor dfr options.png2014-11-18T19:40:10Z<p>CJR: </p>
<hr />
<div></div>CJRhttp://www.papermachines.org/wiki/page/Working_with_JSTOR_Data_for_ResearchWorking with JSTOR Data for Research2014-11-18T19:39:17Z<p>CJR: Created page with "http://dfr.jstor.org/ JSTOR Data for Research provides basic bibliographic information and word counts for all the texts from their collection. This data can be fed into m..."</p>
<hr />
<div>[http://dfr.jstor.org/ JSTOR Data for Research] provides basic bibliographic information and word counts for all the texts from their collection. This data can be fed into machine learning algorithms such as the topic models included in [http://mallet.cs.umass.edu/topics.php MALLET]. At present, only the topic modeling functionality within Paper Machines supports DfR input.<br />
<br />
To obtain a DfR dataset, you must first [http://dfr.jstor.org/accounts/register/ register] for a DfR account. You can then search for articles based on keywords, years of publication, specific journals, and so on. Note that if your query returns more than 1,000 results, you will receive a random sample of 1,000 documents.<br />
<br />
Once the query has been refined to your liking, go to the Dataset Requests menu at the upper right and click "Submit New Request."<br />
<br />
Check the "Citations" and "Word Counts" boxes, select CSV output format, and enter a short job title that describes your query. For example:<br />
<br />
<br />
Once you click "Submit Job", you will be taken to a history of your submitted requests. You will be e-mailed once the dataset is complete.<br />
<br />
Click "Download (#### docs)" in the Full Dataset column, and a zip file timestamped with the request time will be downloaded. This file (or several files from related queries, e.g., several searches divided up by decade) may then be incorporated into a model.<br />
<br />
Paper Machines typically operates only on Zotero collections with full-text documents. In order to use DfR datasets, the easiest method is to create a new folder in Zotero and add one empty text note, then do "Extract Text" from the Paper Machines context menu. This will in effect "trick" the software into thinking you have a suitable full-text collection for analysis.<br />
<br />
Once the collection is extracted, you can create a topic model by opening the context menu and selecting Topic Modeling -> "By Time (With JSTOR DFR)." If you select multiple zip files, they will be merged and duplicates discarded before analysis begins.<br />
<br />
Be warned, this may take a considerable amount of time before it begins to show progress (~15-30 minutes).</div>CJRhttp://www.papermachines.org/wiki/page/Getting_StartedGetting Started2014-11-18T19:21:09Z<p>CJR: /* Advanced Paper Machines Skills: JSTOR Data For Research */</p>
<hr />
<div>== Installation ==<br />
<br />
In order to run Paper Machines, you will need:<br />
<br />
* [http://www.zotero.org/ Zotero] with PDF indexing tools installed (see the Search pane of Zotero's Preferences)<br />
* a corpus of documents with full text PDF/HTML and metadata<br />
* Java 6 or higher ([http://java.com/en/download/index.jsp download page])<br />
<br />
The latest version of the software is available at [http://papermachines.org/install http://papermachines.org/install].<br />
<br />
== Basic Tips for Using Paper Machines ==<br />
<br />
Some quick suggestions from Jo Guldi on using the software:<br />
* Paper Machines is not a standalone application. It runs in Zotero, so you must install Zotero first. Go to http://zotero.org and install the standalone version of Zotero (i.e. not the plugin for Firefox or Chrome, although you can install those as well if you like)<br />
* Paper Machines runs on full-text, OCR'd PDFs that are attached to fully annotated citations in Zotero with good metadata.<br />
** Full-text OCR'd PDFs are documents in which the computer can recognize the text. If you cannot highlight words in a PDF, it has not been OCR'd. You can find OCR software online. OCR'd PDFs are frequently available for download from services like JSTOR.<br />
** good metadata means that your citations in Zotero have author, title, date of publication, and place of publication, and are coded as books, articles, or chapters. <br />
** Paper Machines can run on HTML "snapshot" attachments (saved web pages with the icon of a camera), but make sure that these contain the full text you expect -- they may not download properly with some databases. It also reads text notes and plain text documents, configurable through the context menu -> "Paper Machines Preferences..."<br />
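The "can you highlight words?" test for OCR can also be approximated from a script. The sketch below is only an illustration, not part of Paper Machines: it assumes the <code>pdftotext</code> command-line tool is installed and on your PATH, and the 20-word threshold and the function names are arbitrary choices made here.

```python
import subprocess


def extract_text(pdf_path):
    """Run pdftotext on a PDF and return its text.

    The trailing "-" asks pdftotext to write to standard output
    instead of a .txt file next to the PDF.
    """
    result = subprocess.run(
        ["pdftotext", pdf_path, "-"],
        capture_output=True,
        check=True,
    )
    return result.stdout.decode("utf-8", errors="replace")


def has_text_layer(text, min_words=20):
    """Heuristic: a scanned PDF without OCR yields little or no text.

    min_words is an arbitrary illustrative threshold.
    """
    return len(text.split()) >= min_words
```

A call such as <code>has_text_layer(extract_text("article.pdf"))</code> (with a hypothetical file name) returning <code>False</code> suggests the PDF still needs OCR before Paper Machines can read it.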
<br />
== Basic Troubleshooting ==<br />
<br />
* Update Java, restart the computer, and restart Zotero. If anything crashes, restart again. Give the first run a lot of time.<br />
** After "extracting texts" the first time, you must click on another folder to allow Paper Machines to refresh. Then return to the folder that you were working on.<br />
* Depending upon the speed of one's machine, it can take a long time to do the initial read of a folder -- even on my ramped up macbook pro, I typically leave it running overnight on big folders.<br />
<br />
== Curating Your Corpus for Better Analysis == <br />
<br />
* Getting good results depends on pre-curating the texts in Zotero so as to ask intelligent questions. For instance, Paper Machines can compare a series of folders. What should those folders be? If you want to compare French and English novels, then there should be one folder of French novels and another of English. If you want to compare seven units at the World Bank, then each unit should get a folder. If you are looking for change over time, it might make sense to divide up your texts into decades or centuries, pre-civil-war and post-civil-war, so that the two sets can be easily compared. <br />
* Also, because the topic-modeler handles texts as a bucket-of-words, better results may come from splitting up big pdf's. Break up full-text novels or whole World Bank Reports into several different documents, each a book chapter or section.<br />
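To see why splitting matters, here is a minimal Python sketch of the bucket-of-words idea. It is not the topic modeler's own code. Fixed-size chunking stands in for the chapter or section boundaries recommended above, and the function names and default chunk size are illustrative assumptions.

```python
import re
from collections import Counter


def bag_of_words(text):
    """Count word occurrences, discarding word order entirely."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def split_into_chunks(text, words_per_chunk=1000):
    """Split a long text into consecutive chunks of at most N words.

    A stand-in for splitting by chapter or section: each chunk would
    become its own document in the corpus.
    """
    words = text.split()
    return [
        " ".join(words[i:i + words_per_chunk])
        for i in range(0, len(words), words_per_chunk)
    ]
```

Because <code>bag_of_words</code> ignores order, a whole novel collapses into a single set of counts. Treating each chunk as a separate document lets a model see which words occur together within a chapter-sized stretch rather than anywhere in the book.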
<br />
== Advanced Paper Machines Skills: JSTOR Data For Research ==<br />
* To ask larger questions about historiography in the humanities and social sciences, it is sometimes useful to run Paper Machines on many articles at once. Paper Machines is built to work with JSTOR, the nonprofit article repository that covers much of the publication in the humanities and social sciences over the twentieth century.<br />
* You can download datasets from JSTOR Data for Research: http://dfr.jstor.org<br />
* For more detailed instructions, see [[Working with JSTOR Data for Research]]<br />
<br />
== Common Error Messages ==<br />
<br />
"The process log is displayed below. Refreshing status in 15 seconds.<br />
No log file found."<br />
* This is not actually an error message. Pardon our mess. The processor is still running the algorithm, which will take some time. Let the screen continue to refresh.<br />
* Depending on the function you have asked the computer to perform, the speed of your computer, and the size of the PDFs, it may take some time to execute these commands.<br />
* If necessary, tell your computer not to go to sleep and leave it alone for 15 minutes to 15 hours. It will eventually run.<br />
<br />
"ERROR:root:pdftotext not found"<br />
* You may need to install the software dependency pdftotext. Zotero can download and install it for you: go to the gear icon in the toolbar -> Preferences... -> the Search tab -> click "Check for installer."<br />
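If you want to check from a script whether a <code>pdftotext</code> executable is visible on your system, a minimal Python sketch follows. Note the assumption: this only searches the system PATH. Zotero may keep its own copy of the tool elsewhere, so a negative result here does not prove that Zotero's installer failed. The function names are hypothetical.

```python
import shutil


def find_tool(name="pdftotext"):
    """Return the full path of an executable on the PATH, or None."""
    return shutil.which(name)


def describe_tool(name="pdftotext"):
    """Produce a one-line, human-readable status for the tool."""
    path = find_tool(name)
    if path is None:
        return f"{name} was not found on the PATH"
    return f"{name} found at {path}"
```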
<br />
== Joining In, Helping Out ==<br />
<br />
Paper Machines is open-source software cobbled together with minimal research funds. Both the software and its documentation depend on People Like You!<br />
* Please consider contributing your best practices, successes, and error messages to this page. If content is hard to understand, please help us clarify. Editing the wiki requires a login.<br />
* If you are a computer scientist or a programmer who has developed tools like MALLET for analyzing large pieces of text, please consider making a connector between Zotero and your tool, posting it on Github, and letting us know so that it can be added to the Paper Machines suite of tools.</div>CJRhttp://www.papermachines.org/wiki/page/Getting_StartedGetting Started2014-11-18T19:20:35Z<p>CJR: /* Advanced Paper Machines Skills: JSTOR Data For Research */</p>
<hr />
<div>== Installation ==<br />
<br />
In order to run Paper Machines, you will need:<br />
<br />
* [http://www.zotero.org/ Zotero] with PDF indexing tools installed (see the Search pane of Zotero's Preferences)<br />
* a corpus of documents with full text PDF/HTML and metadata<br />
* Java 6 or higher ([http://java.com/en/download/index.jsp download page])<br />
<br />
The latest version of the software is available at [http://papermachines.org/install http://papermachines.org/install].<br />
<br />
== Basic Tips for Using Paper Machines ==<br />
<br />
Some quick suggestions from Jo Guldi on using the software:<br />
* Paper Machines is not a standalone application. It runs in Zotero, so you must install Zotero first. Go to http://zotero.org and install the standalone version of Zotero (i.e. not the plugin for Firefox or Chrome, although you can install those as well if you like)<br />
* Paper Machines runs on full-text, OCR'd PDFs that are attached to fully annotated citations in Zotero with good metadata.<br />
** Full-text OCR'd PDFs are documents in which the computer can recognize the text. If you cannot highlight words in a PDF, it has not been OCR'd. You can find OCR software online. OCR'd PDFs are frequently available for download from services like JSTOR.<br />
** good metadata means that your citations in Zotero have author, title, date of publication, and place of publication, and are coded as books, articles, or chapters. <br />
** Paper Machines can run on HTML "snapshot" attachments (saved web pages with the icon of a camera), but make sure that these contain the full text you expect -- they may not download properly with some databases. It also reads text notes and plain text documents, configurable through the context menu -> "Paper Machines Preferences..."<br />
<br />
== Basic Troubleshooting ==<br />
<br />
* update Java, restart the computer, restart Zotero; if anything crashes, restart again; give it a lot of time to do the first run. <br />
** after "extracting texts" the first time, you must click on another folder to allow Paper Machines to refresh. then return to the folder that you were working on. <br />
* Depending upon the speed of one's machine, it can take a long time to do the initial read of a folder -- even on my ramped up macbook pro, I typically leave it running overnight on big folders.<br />
<br />
== Curating Your Corpus for Better Analysis == <br />
<br />
* Getting good results depends on pre-curating the texts in Zotero so as to ask intelligent questions. For instance, Paper Machines can compare a series of folders. What should those folders be? If you want to compare French and English novels, then there should be one folder of French novels and another of English. If you want to compare seven units at the World Bank, then each unit should get a folder. If you are looking for change over time, it might make sense to divide up your texts into decades or centuries, pre-civil-war and post-civil-war, so that the two sets can be easily compared. <br />
* Also, because the topic-modeler handles texts as a bucket-of-words, better results may come from splitting up big pdf's. Break up full-text novels or whole World Bank Reports into several different documents, each a book chapter or section.<br />
<br />
== Advanced Paper Machines Skills: JSTOR Data For Research ==<br />
* to ask larger questions about historiography in the humanities and social sciences, it is sometimes useful to run Paper Machines on many articles at once. Paper Machines is built to work with JSTOR, the nonprofit article repository that covers much of the publication in the humanities and social sciences over the twentieth century. <br />
* you can download datasets from JSTOR data for research: http://dfr.jstor.org<br />
* For more info, see [[Working with JSTOR Data for Research]]<br />
<br />
== Common Error Messages ==<br />
<br />
"The process log is displayed below. Refreshing status in 15 seconds.<br />
No log file found."<br />
* this is not actually an error message. pardon our mess. the processor is still running the algorithm. it will take some time. let the screen continue to refresh. <br />
* Depending on the function you have asked the computer to perform, the speed of your computer, and the size of the PDFs, it may take some time to execute these commands.<br />
* if necessary, tell your computer not to go to sleep and leave it alone for 15 minutes to 15 hours. it will eventually run.<br />
<br />
"ERROR:root:pdftotext not found"<br />
* You may need to install the software dependency pdftotext. Zotero can download and install it for you: go to the gear icon in the toolbar -> Preferences... -> the Search tab -> click "Check for installer."<br />
<br />
== Joining In, Helping Out ==<br />
<br />
Paper Machines is open-source software cobbled together with minimal research funds. Both the software and its documentation depend on People Like You!<br />
* please consider contributing your best practices, successes, and error messages to this page. If content is hard to understand, please help us clarify. Editing the wiki requires a login.<br />
* If you are a computer scientist or a programmer who has developed tools like MALLET for analyzing large pieces of text, please consider making a connector between Zotero and your tool, posting it on Github, and letting us know so that it can be added to the Paper Machines suite of tools.</div>CJRhttp://www.papermachines.org/wiki/page/Getting_StartedGetting Started2014-11-16T19:24:46Z<p>CJR: </p>
<hr />
<div>== Installation ==<br />
<br />
In order to run Paper Machines, you will need:<br />
<br />
* [http://www.zotero.org/ Zotero] with PDF indexing tools installed (see the Search pane of Zotero's Preferences)<br />
* a corpus of documents with full text PDF/HTML and metadata<br />
* Java 6 or higher ([http://java.com/en/download/index.jsp download page])<br />
<br />
The latest version of the software is available at [http://papermachines.org/install http://papermachines.org/install].<br />
<br />
== Basic Tips for Using Paper Machines ==<br />
<br />
Some quick suggestions from Jo Guldi on using the software:<br />
* Paper Machines is not a standalone application. It runs in Zotero, so you must install Zotero first. Go to http://zotero.org and install the standalone version of Zotero (i.e. not the plugin for Firefox or Chrome, although you can install those as well if you like)<br />
* Paper Machines runs on full-text, OCR'd PDFs that are attached to fully annotated citations in Zotero with good metadata.<br />
** Full-text OCR'd PDFs are documents in which the computer can recognize the text. If you cannot highlight words in a PDF, it has not been OCR'd. You can find OCR software online. OCR'd PDFs are frequently available for download from services like JSTOR.<br />
** good metadata means that your citations in Zotero have author, title, date of publication, and place of publication, and are coded as books, articles, or chapters. <br />
** Paper Machines can run on HTML "snapshot" attachments (saved web pages with the icon of a camera), but make sure that these contain the full text you expect -- they may not download properly with some databases. It also reads text notes and plain text documents, configurable through the context menu -> "Paper Machines Preferences..."<br />
<br />
== Basic Troubleshooting ==<br />
<br />
* update Java, restart the computer, restart Zotero; if anything crashes, restart again; give it a lot of time to do the first run. <br />
** after "extracting texts" the first time, you must click on another folder to allow Paper Machines to refresh. then return to the folder that you were working on. <br />
* Depending upon the speed of one's machine, it can take a long time to do the initial read of a folder -- even on my ramped up macbook pro, I typically leave it running overnight on big folders.<br />
<br />
== Curating Your Corpus for Better Analysis == <br />
<br />
* Getting good results depends on pre-curating the texts in Zotero so as to ask intelligent questions. For instance, Paper Machines can compare a series of folders. What should those folders be? If you want to compare French and English novels, then there should be one folder of French novels and another of English. If you want to compare seven units at the World Bank, then each unit should get a folder. If you are looking for change over time, it might make sense to divide up your texts into decades or centuries, pre-civil-war and post-civil-war, so that the two sets can be easily compared. <br />
* Also, because the topic-modeler handles texts as a bucket-of-words, better results may come from splitting up big pdf's. Break up full-text novels or whole World Bank Reports into several different documents, each a book chapter or section.<br />
<br />
== Advanced Paper Machines Skills: JSTOR Data For Research ==<br />
* to ask larger questions about historiography in the humanities and social sciences, it is sometimes useful to run Paper Machines on many articles at once. Paper Machines is built to work with JSTOR, the nonprofit article repository that covers much of the publication in the humanities and social sciences over the twentieth century. <br />
* you can download datasets from JSTOR data for research: http://dfr.jstor.org<br />
[more instructions coming]<br />
<br />
== Common Error Messages ==<br />
<br />
"The process log is displayed below. Refreshing status in 15 seconds.<br />
No log file found."<br />
* this is not actually an error message. pardon our mess. the processor is still running the algorithm. it will take some time. let the screen continue to refresh. <br />
* Depending on the function you have asked the computer to perform, the speed of your computer, and the size of the PDFs, it may take some time to execute these commands.<br />
* if necessary, tell your computer not to go to sleep and leave it alone for 15 minutes to 15 hours. it will eventually run.<br />
<br />
"ERROR:root:pdftotext not found"<br />
* You may need to install the software dependency pdftotext. Zotero can download and install it for you: go to the gear icon in the toolbar -> Preferences... -> the Search tab -> click "Check for installer."<br />
<br />
== Joining In, Helping Out ==<br />
<br />
Paper Machines is open-source software cobbled together with minimal research funds. Both the software and its documentation depend on People Like You!<br />
* please consider contributing your best practices, successes, and error messages to this page. If content is hard to understand, please help us clarify. Editing the wiki requires a login.<br />
* If you are a computer scientist or a programmer who has developed tools like MALLET for analyzing large pieces of text, please consider making a connector between Zotero and your tool, posting it on Github, and letting us know so that it can be added to the Paper Machines suite of tools.</div>CJRhttp://www.papermachines.org/wiki/page/Main_PageMain Page2014-11-14T22:10:12Z<p>CJR: </p>
<hr />
<div>'''Welcome to the Paper Machines wiki.'''<br />
<br />
This is a space for user-contributed content related to Paper Machines.<br />
<br />
[[Getting Started]]</div>CJRhttp://www.papermachines.org/wiki/page/Getting_StartedGetting Started2014-11-14T22:09:05Z<p>CJR: </p>
<hr />
<div>== Installation ==<br />
<br />
In order to run Paper Machines, you will need:<br />
<br />
* [http://www.zotero.org/ Zotero] with PDF indexing tools installed (see the Search pane of Zotero's Preferences)<br />
* a corpus of documents with full text PDF/HTML and metadata<br />
* Java 6 or higher ([http://java.com/en/download/index.jsp download page])<br />
<br />
The latest version of the software is available at [http://papermachines.org/install http://papermachines.org/install].<br />
<br />
== Basic Tips for Using Paper Machines ==<br />
<br />
Some quick suggestions from Jo Guldi on using the software:<br />
* Basic advice: update Java, restart the computer, restart Zotero; if anything crashes, restart again; give it a lot of time to do the first run. <br />
* Depending upon the speed of one's machine, it can take a long time to do the initial read of a folder -- even on my ramped up macbook pro, I typically leave it running overnight on big folders.<br />
* Getting good results depends on pre-curating the texts in Zotero so as to ask intelligent questions. For instance, Paper Machines can compare a series of folders. What should those folders be? If you want to compare French and English novels, then there should be one folder of French novels and another of English. If you want to compare seven units at the World Bank, then each unit should get a folder. If you are looking for change over time, it might make sense to divide up your texts into decades or centuries, pre-civil-war and post-civil-war, so that the two sets can be easily compared. <br />
* Also, because the topic-modeler handles texts as a bucket-of-words, better results may come from splitting up big pdf's. Break up full-text novels or whole World Bank Reports into several different documents, each a book chapter or section.</div>CJRhttp://www.papermachines.org/wiki/page/Getting_StartedGetting Started2014-11-14T22:05:18Z<p>CJR: Created page with "== Installation == In order to run Paper Machines, you will need: * [http://www.zotero.org/ Zotero] with PDF indexing tools installed (see the Search pane of Zotero's Prefer..."</p>
<hr />
<div>== Installation ==<br />
<br />
In order to run Paper Machines, you will need:<br />
<br />
* [http://www.zotero.org/ Zotero] with PDF indexing tools installed (see the Search pane of Zotero's Preferences)<br />
* a corpus of documents with full text PDF/HTML and metadata<br />
* Java 6 or higher ([http://java.com/en/download/index.jsp download page])<br />
<br />
The latest version is also available at [http://papermachines.org/install/].<br />
<br />
== Basic Tips for Using Paper Machines ==<br />
<br />
Some quick suggestions from Jo Guldi on using the software:<br />
* Basic advice: update Java, restart the computer, restart Zotero; if anything crashes, restart again; give it a lot of time to do the first run. <br />
* Depending upon the speed of one's machine, it can take a long time to do the initial read of a folder -- even on my ramped up macbook pro, I typically leave it running overnight on big folders.<br />
* Getting good results depends on pre-curating the texts in Zotero so as to ask intelligent questions. For instance, Paper Machines can compare a series of folders. What should those folders be? If you want to compare French and English novels, then there should be one folder of French novels and another of English. If you want to compare seven units at the World Bank, then each unit should get a folder. If you are looking for change over time, it might make sense to divide up your texts into decades or centuries, pre-civil-war and post-civil-war, so that the two sets can be easily compared. <br />
* Also, because the topic-modeler handles texts as a bucket-of-words, better results may come from splitting up big pdf's. Break up full-text novels or whole World Bank Reports into several different documents, each a book chapter or section.</div>CJRhttp://www.papermachines.org/wiki/page/Main_PageMain Page2014-11-14T21:49:22Z<p>CJR: </p>
<hr />
<div>'''Welcome to the Paper Machines wiki.'''<br />
<br />
This is a space for user-contributed content related to Paper Machines.<br />
<br />
[[Getting Started]]<br />
<br />
[[General Tips]]</div>CJRhttp://www.papermachines.org/wiki/page/Main_PageMain Page2014-11-14T21:49:01Z<p>CJR: </p>
<hr />
<div>'''Welcome to the Paper Machines wiki.'''<br />
<br />
This is a space for user-contributed content related to Paper Machines.<br />
<br />
[[Getting Started]]<br />
[[General Tips]]</div>CJRhttp://www.papermachines.org/wiki/page/Main_PageMain Page2014-11-14T21:46:19Z<p>CJR: </p>
<hr />
<div>'''Welcome to the Paper Machines wiki.'''<br />
<br />
This is a space for user-contributed content related to Paper Machines.<br />
<br />
[[General Tips]]</div>CJRhttp://www.papermachines.org/wiki/page/Main_PageMain Page2014-11-14T21:46:09Z<p>CJR: Replaced content with "'''Welcome to the Paper Machines wiki.''' This is a space for user-contributed content related to Paper Machines. General Tips"</p>
<hr />
<div>'''Welcome to the Paper Machines wiki.'''<br />
<br />
This is a space for user-contributed content related to Paper Machines.<br />
<br />
[[General Tips]]</div>CJRhttp://www.papermachines.org/wiki/page/Main_PageMain Page2014-11-14T21:45:10Z<p>CJR: </p>
<hr />
<div>'''Welcome to the Paper Machines wiki.'''<br />
<br />
This is a space for user-contributed content related to Paper Machines.<br />
<br />
<br />
== Basic Tips for Use ==<br />
<br />
Some quick suggestions from Jo Guldi on using the software:<br />
* Basic advice: update Java, restart the computer, restart Zotero; if anything crashes, restart again; give it a lot of time to do the first run. <br />
* Depending on the speed of your machine, the initial read of a folder can take a long time. Even on my souped-up MacBook Pro, I typically leave it running overnight on big folders.<br />
* Getting good results depends on pre-curating the texts in Zotero so that you can ask intelligent questions. For instance, Paper Machines can compare a series of folders. What should those folders be? If you want to compare French and English novels, then there should be one folder of French novels and another of English novels. If you want to compare seven units at the World Bank, then each unit should get a folder. If you are looking for change over time, it might make sense to divide your texts by decade or century, or into pre-Civil War and post-Civil War sets, so that the groups can be easily compared.<br />
* Also, because the topic modeler treats each text as a bag of words, better results may come from splitting up big PDFs. Break up full-text novels or whole World Bank reports into several documents, one per chapter or section.</div>CJRhttp://www.papermachines.org/wiki/page/Main_PageMain Page2014-11-14T21:44:11Z<p>CJR: </p>
<hr />
<div>'''Welcome to the Paper Machines wiki.'''<br />
<br />
This is a space for user-contributed content related to Paper Machines.<br />
<br />
== Basic Tips for Use ==<br />
<br />
Some quick pieces of advice from Jo Guldi on using the software:<br />
* Basic advice: update Java, restart the computer, restart Zotero; if anything crashes, restart again; give it a lot of time to do the first run. <br />
* Depending on the speed of your machine, the initial read of a folder can take a long time. Even on my souped-up MacBook Pro, I typically leave it running overnight on big folders.<br />
* Getting good results depends on pre-curating the texts in Zotero so that you can ask intelligent questions. For instance, Paper Machines can compare a series of folders. What should those folders be? If you want to compare French and English novels, then there should be one folder of French novels and another of English novels. If you want to compare seven units at the World Bank, then each unit should get a folder. If you are looking for change over time, it might make sense to divide your texts by decade or century, or into pre-Civil War and post-Civil War sets, so that the groups can be easily compared.<br />
* Also, because the topic modeler treats each text as a bag of words, better results may come from splitting up big PDFs. Break up full-text novels or whole World Bank reports into several documents, one per chapter or section.</div>CJRhttp://www.papermachines.org/wiki/page/Main_PageMain Page2014-11-14T21:28:17Z<p>CJR: </p>
<hr />
<div>'''Welcome to the Paper Machines wiki.'''<br />
<br />
This is a space for user-contributed content related to Paper Machines.</div>CJR