Jinfo Blog

Detecting online plagiarism

Wednesday, 8th August 2012

By Arthur Weiss


Abstract

Every organisation that publishes content on the web needs to be aware of the potential for plagiarism and how to stay alert to it. Plagiarism can put your organisation's reputation, as well as its revenue, at risk.

Item

Plagiarism involves using somebody else’s work and claiming authorship without crediting the original author. It differs from copyright infringement, which means publishing material without obtaining permission from the rights holder. Copyright infringement may or may not involve the misattribution of authorship.

Online plagiarism (or website plagiarism) is the copying of website or blog content and passing it off as original material on another website. Generally this also involves a breach of copyright. The problem for content owners is detecting online plagiarism and stopping it.

Detecting plagiarised content can be hard work. Nevertheless there are approaches that can catch obvious plagiarists.

One approach is to search for unique phrases using Google or another search engine. As an example, our website includes the phrase “No business is an island”. Google lists over 100,000 hits for this phrase. However, a few sentences later we use the phrase “businesses are at war”. Adding this to the search gave 20 or so results, almost all of which came from sites that took the material from our website. This can be automated using Google Alerts, which sends a notification when new plagiarised content appears.
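The phrase search described above can be sketched in a few lines of Python. This is only an illustration of combining two exact phrases into one query: the function names are hypothetical, `example.com` stands in for your own domain, and the `-site:` exclusion is an optional extra that is not part of the search described in the text.

```python
from urllib.parse import urlencode


def exact_phrase_query(phrases, exclude_site=None):
    """Join phrases into one query, wrapping each in double quotes
    so the search engine treats it as an exact phrase."""
    parts = ['"{}"'.format(p.strip()) for p in phrases if p.strip()]
    if exclude_site:
        # Assumption: leave out hits from your own site with -site:
        parts.append("-site:" + exclude_site)
    return " ".join(parts)


def search_url(query, base="https://www.google.com/search"):
    """Build a search URL that can be opened in a browser."""
    return base + "?" + urlencode({"q": query})


query = exact_phrase_query(
    ["No business is an island", "businesses are at war"],
    exclude_site="example.com",  # placeholder for your own domain
)
url = search_url(query)
print(query)
print(url)
```

Combining a common phrase with a second, rarer one from the same passage is what narrows the results. The same quoted query string can be pasted into Google Alerts.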

Another approach to discovering plagiarisers is to use a dedicated tool such as Copyscape. With Copyscape, you enter the URL to be checked and the site returns a list of sites that have copied content: up to 10 in the free version and an unlimited number in the paid version (which also allows a batch search of a complete website). Copyscape also offers an alerting service. TinEye is a similar service that can be used to find copied images.

Identifying plagiarism is only the first step. Getting the plagiariser to remove the copied content is harder. Firstly, you need proof that the content has been copied and that your content pre-dated the copied content. (Sites like the Wayback Machine at Archive.org are useful for this.)
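As a rough sketch of how archived dates might be gathered, the Wayback Machine offers an availability endpoint that returns the capture closest to a requested date. The helper names below are hypothetical, `example.com/page` is a placeholder, and the network call is defined but not executed here. Treat the response handling as an assumption to check against the Internet Archive's own documentation.

```python
import json
from datetime import datetime
from urllib.parse import urlencode
from urllib.request import urlopen

WAYBACK_API = "https://archive.org/wayback/available"


def availability_url(page_url, timestamp=None):
    """Build the lookup URL; timestamp is YYYYMMDD (optionally with hhmmss)."""
    params = {"url": page_url}
    if timestamp:
        params["timestamp"] = timestamp
    return WAYBACK_API + "?" + urlencode(params)


def closest_snapshot(payload):
    """Return (snapshot_url, capture_datetime) from a decoded response,
    or None when no capture is reported as available."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    when = datetime.strptime(closest["timestamp"], "%Y%m%d%H%M%S")
    return closest["url"], when


def fetch_closest(page_url, timestamp=None):
    """Query the live service (requires network access; not called here)."""
    with urlopen(availability_url(page_url, timestamp), timeout=10) as resp:
        return closest_snapshot(json.load(resp))


# Placeholder page; asking for an early date favours an early capture.
print(availability_url("example.com/page", "20120101"))
```

Running the same lookup for your own page and for the copying page, and comparing the two capture dates, gives one piece of evidence about which appeared first. A capture date shows only when the archive first saw the page, not when it was actually published.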

Contacting plagiarisers sometimes works, and the copied content is removed. More usually, follow-up actions are required. The hosting provider can be informed that the site is infringing copyright, and search engines can be asked not to include the infringing pages in their indexes. The final recourse is legal action, which can be expensive. Ultimately, the approach taken should be based on weighing the threat posed by the copied material against the effort required to get it removed.

This is a short version of a longer article on the same topic available as part of the FreePint Subscription. The longer article discusses in more detail how to identify and protect against plagiarism, as well as the risks an organisation faces by ignoring it. Subscribers can log in to view it now.
