Suggestions for dealing with Content Thieves

Discussion in 'Novel General' started by Prosperous_Food, Mar 17, 2018.

Thread Status:
Not open for further replies.
  1. Prosperous_Food

    Prosperous_Food Active Member

    Joined:
    Feb 22, 2018
    Messages:
    96
    Likes Received:
    92
    Hi fellow translators and webmasters,

    Is your site hurt by a content thief? Did you discover that some other sites have been stealing your content from your site without permission? Are you upset and angry that the hard work you did for so long was just ripped off by another site?

    You are not alone!

    I started this thread to guide all hardworking translators on how to fight these content thieves. It is time we united and stood up against these content thieves. For those who want to know what can be done, I have chronicled my experience with fighting these crooks. Hope this guide is useful to all translators being plagued by this problem.

    What is content theft?

    Content theft is when someone takes your content without permission. They do it to draw visitors to their own sites and earn advertising revenue.

    [​IMG]

    [​IMG]
    [​IMG]

    Your options and the Pros and Cons of each approach.
    [​IMG]
    1. Ignore them

    This is by far the easiest approach you can take. Usually, the most popular translators would recommend this because it takes A LOT of time to fight these crooks.

    [​IMG]

    2. Attack them and protect your site.

    You need some knowledge of the internet, including Whois lookups, and the ability to block IP addresses if you want to protect your site from these bots. The thing is, this is hard.

    [​IMG]


    If this approach is what you want, start by contacting the scraper and asking them to take the content down. In our experience, most scraping websites do not have a contact form; if they do, use it. If they refuse or simply do not reply, which is what most of them will do, file a DMCA (Digital Millennium Copyright Act) takedown notice with their host.

    So do a Whois lookup.

    If this is too complex for you, just use 3rd party sites that protect your website against these people. See the DMCA Takedown website at http://www.dmca.com/takedowns.aspx. This is a free service. I recommend you try it.

    Once you have done the Whois lookup, you have their IP address, so you can use it to block the bots. The problem is that these bots are really cunning and their IP addresses keep changing. I have not had much luck doing this, even with 2 firewalls and an IP blacklist. That leads me to the last option.

    3. Use them


    Sun Tzu's Art of War. If you cannot beat them, use them to your advantage. You see, the bots are just bots. They have no human brains and just do what they are programmed to do. So your job is to outsmart these bots. We are smart people.

    Understand one thing: these bots are content scrapers (i.e. they are here to steal your content). So what happens if your content tells their readers to read it on your website? You get the idea. We use them to attract more readers to our site. :):)

    [​IMG]

    Here, I tried the second approach, and the battle over the DMCA is still ongoing. Meanwhile, I am attracting their customers and have a massive traffic spike! As you can see from my stats in WordPress, I started doing this 2 days ago and got a huge spike in traffic.
    [​IMG]

    Woahh... Using them is good. The problem is how to use them. For me, I add some words to my content at the top.

    [​IMG]

    So when it appears on the content scraper sites:

    [​IMG]

    I also add some words at the bottom of my content. So when a site grabs content from me, the following is displayed at the bottom of the page on the site that stole it.

    [​IMG]

    Learn to mix and match so that the bots cannot be tuned to catch your notices. Change your statements instead of using the same one every time.

    e.g. My site is prosperous food translations. Sometimes I write it as prosperousfood dot com instead of prosperousfood.com so that the bots do not take it down. DO NOT USE links. The bots remove all URLs automatically.
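
    A minimal sketch of the mix-and-match idea, assuming you paste a notice into each chapter by hand or from a small script. The helper names (`spellOutDomain`, `buildNotice`) and the notice wordings are made up for illustration; only the "dot com" trick comes from the post itself.

```javascript
// Hypothetical helpers: none of these names come from the original post.
// Spell a domain out in words so a scraper that strips URLs leaves it alone.
function spellOutDomain(domain, style) {
  const styles = [
    (d) => d.replace(/\./g, ' dot '),
    (d) => d.replace(/\./g, ' (dot) '),
    (d) => d.replace(/\./g, ' [.] '),
  ];
  return styles[style % styles.length](domain);
}

// Assumed wordings; write your own and keep adding variants.
const templates = [
  (site, where) => `This chapter was translated by ${site}. Read it first at ${where}.`,
  (site, where) => `You are reading a stolen copy. The real one lives at ${where} (${site}).`,
  (site, where) => `Support the translator: visit ${where}, home of ${site}.`,
];

// Pick a wording and a spelling per chapter so consecutive chapters differ.
function buildNotice(site, domain, chapter) {
  const template = templates[chapter % templates.length];
  const where = spellOutDomain(domain, Math.floor(chapter / templates.length));
  return template(site, where);
}

console.log(buildNotice('Prosperous Food Translations', 'prosperousfood.com', 1));
console.log(buildNotice('Prosperous Food Translations', 'prosperousfood.com', 2));
```

    Rotating by chapter number is only one possible choice. Picking a wording at random, or writing each notice by hand, works just as well as long as no single sentence repeats often enough to be filtered.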

    And one row in the middle (invisible, so that readers on my own site are not affected) to piss off their readers. There is nothing like an interruption when you are enjoying a translation.

    [​IMG]

    As you can see, the above content is not visible on my site. There is just some empty space where the hidden content sits, and that content will show up on the scrapers' sites.

    [​IMG]


    So you are essentially killing their customers' enjoyment of the content while increasing your traffic each time they rip off a chapter from you!
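
    The post does not say how the middle line is hidden. One possibility, assumed here, is a CSS class that your own stylesheet hides (for example `.tl-note { display: none; }`), which a scraper that copies only the chapter text or HTML would not carry along. The function name, the class name and the rule are all hypothetical, and the paragraphs are assumed to be HTML-safe already.

```javascript
// Assumption: the site's own stylesheet contains `.tl-note { display: none; }`.
// A scraper that copies only the chapter markup will not bring that rule along,
// so the notice becomes visible on the scraper's copy.
function insertHiddenNotice(paragraphs, notice, className) {
  const middle = Math.floor(paragraphs.length / 2);
  // Wrap every paragraph, then slot the notice in at the midpoint.
  const html = paragraphs.map((text) => `<p>${text}</p>`);
  html.splice(middle, 0, `<p class="${className}">${notice}</p>`);
  return html.join('\n');
}

const chapter = [
  'The hero drew his sword.',
  'The demon laughed.',
  'Thunder rolled over the valley.',
  'Then the fight began.',
];
console.log(insertHiddenNotice(chapter, 'Read this at prosperousfood dot com', 'tl-note'));
```

    Readers who override site styles in their browser may still see the line on your own site, so keep it to one short sentence.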

    (y)(y)(y)

    If each of us does this, readers may start to abandon the content scrapers. Some will ignore your message, but some will choose to read from your site.

    As long as we mix it up in the content, the bots cannot tell what is going on (they are dumb bots, not humans, and hence cannot read, remember?).

    Hence you get more traffic to your site by using the rats that stole your content. So Karma is a bitch. :LOL::LOL::LOL: Damn the bots!

    P.S. If the sites try to delete your comments by hand, they are going to have a hard time if all of us do this. Imagine having 1000+ sites to read and edit. The content scrapers are lazy. They just want to profit off you without doing any work. That's why they employ bots.

    Hope this helps all content owners and translators to deal with these content thieves.
     
  2. Robbini

    Robbini Logical? Illogical? Random? Or Just Unique?

    Joined:
    Oct 20, 2015
    Messages:
    2,887
    Likes Received:
    1,749
    One problem I notice with 3) is that if you keep using the exact same notices in the same places in every chapter, the scrapers will eventually realize and maybe spend a few seconds per chapter removing them.
     
  3. Minokyuu

    Minokyuu ( ͡° ͜ʖ ͡°)

    Joined:
    Oct 14, 2017
    Messages:
    847
    Likes Received:
    631
    Good thinking. Hopefully people follow it and don't just ignore it. I like trolling those ass-h much better than anything else :p
     
  4. Xane

    Xane Well-Known Member

    Joined:
    Aug 5, 2016
    Messages:
    1,416
    Likes Received:
    1,200
    I use a browser script to change website styles (font and background colors).

    With it, the mid-chapter text is not invisible.
     
  5. noisypixy

    noisypixy Sacatunn que pen, que summum que tun.

    Joined:
    Jun 25, 2016
    Messages:
    716
    Likes Received:
    950
    Ooh you mean like those blogs that post their links on NU?
     
  6. Prosperous_Food

    Prosperous_Food Active Member

    One line in mid-chapter and your readers are unlikely to be pissed off. Just do not overdo this. :)

    Hi, Robbini, as I said, we are humans and our advantage over the bots is that we have brains.

    So let's use these brains to our advantage ok?

    Do not use the same words all the time, and do not always put them at the beginning or the end of the chapter. Put some in the text in the middle. :) Mix it up to stay one step ahead of the bots.

    You are a translator. How many ways could you list your website? :)
     
    Last edited by a moderator: Mar 17, 2018
  7. J-Mitch

    J-Mitch ⚖ Tipping the Scales of the World

    Joined:
    Mar 11, 2016
    Messages:
    1,922
    Likes Received:
    3,759
    Wait, how did you do this?
     
  8. oblueknighto

    oblueknighto Blue Person

    Joined:
    Oct 20, 2015
    Messages:
    3,197
    Likes Received:
    2,239
    Hope we can defeat these thieves one day.
     
  9. Drake98

    Drake98 Concerned Fan

    Joined:
    Jan 23, 2016
    Messages:
    1,553
    Likes Received:
    890
    Does this require knowledge of HTML or something like it?
     
  10. HnM_Pete

    HnM_Pete Well-Known Member

    Joined:
    Sep 13, 2016
    Messages:
    120
    Likes Received:
    212
    Like this for example. There are also plugins for Chromium-based browsers (like Stylebot). Google <browser>+"custom site css"; there are plenty of guides.
     
  11. Prosperous_Food

    Prosperous_Food Active Member

    No. It just requires you to add some words to market your own site in each chapter. Not hard, right? :)
     
  12. J-Mitch

    J-Mitch ⚖ Tipping the Scales of the World

    Ah, so he changes the design of the site himself, and that renders the content that was supposed to be invisible, "visible."

    Gotcha. I'm surprised you'd study up on a site to do that. Some sites (like mine) have changing CSS. Well, not all of it, but most of it is dynamic. So... that's work?
     
  13. TamaSaga

    TamaSaga Well-Known Member

    Joined:
    Oct 11, 2016
    Messages:
    1,726
    Likes Received:
    2,173
    When it comes to programming a bot, you need well defined solutions. You can't just point a finger and expect to get it done. Content scrapers can easily lift the text, but you need them to make additional passes to filter out junk.

    Contrary to what you see on the front page, the stuff on the backend can get exceedingly messy. Especially when the translators use javascript and liberal html obfuscation.

    Hmm, sure you can clean it all manually. Say a ton of bean varieties are scattered on the floor. You hate lima beans and pinto beans, so you're perfectly happy leaving them there. But you love mung beans and red beans, so you start picking them up one by one by hand.

    Why not modify a chick sorting machine to do that instead and save you time? You spend less time picking up the beans and more time enjoying the cuisine that you cooked with them.

    But that's basically what the chromium scripts are doing. Instead of keeping the bean variety count low, they're adding even more bean species to the mix so the chick sorting machine has to be adjusted properly or stuff that you hate will come through. In theory, if you add enough exceptions, you'll break the machine.
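
    To make the bean-sorting picture concrete, here is a rough sketch of the kind of filter pass a scraper might bolt on. The rule list and the sample lines are invented for illustration; no real scraper's code is being quoted.

```javascript
// Hypothetical filter pass a scraper might run; the rule list is an assumption.
const junkRules = [
  /^Read this chapter at .+$/i,
  /^This chapter was translated by .+$/i,
];

// Drop every line that matches a known rule; anything unrecognized gets through.
function filterJunk(lines, rules) {
  return lines.filter((line) => !rules.some((rule) => rule.test(line)));
}

const scraped = [
  'Read this chapter at prosperousfood dot com',
  'The hero drew his sword.',
  'Psst, the translator lives at prosperousfood (dot) com',
];
console.log(filterJunk(scraped, junkRules));
```

    The first notice matches a rule and is removed, while the reworded one slips through with the story text. Every new wording costs the scraper another rule, which is the "adding more bean species" effect.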
     
  14. HnM_Pete

    HnM_Pete Well-Known Member

    You can copy and tweak existing CSS with the browser's developer tools. It's also possible to do global overrides. Say you want all links, on all sites, to be in block capitals in Comic Sans at font size 20 (because you're crazy like that); this can be done. You hate bright backgrounds? You can set the default HTML body background to black and the default font color to teal or something. It'll work on most sites, because the HTML elements are mostly the same. Sure, it'll break once in a while, but nothing is perfect. Some people just want to put in the work to enjoy the internet exactly as they want it.
     
  15. Xane

    Xane Well-Known Member

    In Firefox I use an addon called Stylish with this code. Originally I did it because Light Novel Bastion had some weird shit going on with their text having shadows; it was killing my eyes.

    Code:
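    /* Stylish user style for one site; replace website.com with the site's domain. */
    @-moz-document domain("website.com") {
        body * {
            background-color: #2d3238 !important;
            color: #c1b096 !important;
            font-family: Consolas, monospace;
            font-weight: bold;
            /* Zero offset and zero blur: the shadow sits directly behind the glyphs. */
            text-shadow: 0px 0px 0px #121212;
        }
        #content,
        #bodyContent {
            border-color: #121212 !important;
        }
    }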
    Looks like this (colors are my preference, obviously)
    [​IMG]
     
  16. J-Mitch

    J-Mitch ⚖ Tipping the Scales of the World

    Uh.. If I hadn't figured it out later, you would have just confused me more. The purpose of doing that is to change the site to his preference.

    But thanks for... taking the time.

    I see. Some people are particular. I guess having a site that does light/dark mode, enlarged text, overlay backgrounds, etc. would get people more situated and immersed.

    Ah, I remember seeing something like that before. I was wondering what that was all about on Light Bastion.

    And interesting preference.
     
  17. MadHatter

    MadHatter [WindyWeather]

    Joined:
    Feb 12, 2016
    Messages:
    1,796
    Likes Received:
    1,542
    I believe the translator of "Undefeated God of War" over at Translation Nation has the best solution. He uses a two-layer formula: the first layer uses an algorithm to scramble the text so that it is unreadable if anyone tries to steal the content, and the second layer makes it readable for viewers.
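
    How that translator actually does it is not known here. As a rough illustration of what a two-layer scheme could look like, the sketch below scrambles letters with a simple substitution key (layer one) and reverses it for display (layer two). The function names, the seed and the choice of a substitution cipher are all assumptions; on a real page the second layer might be a script or a custom font with remapped glyphs.

```javascript
// Sketch only: this is NOT the translator's method, just one way to get two layers.
const ALPHABET = 'abcdefghijklmnopqrstuvwxyz';

// Build a shuffled alphabet from a numeric seed (Fisher-Yates driven by a small LCG).
function makeKey(seed) {
  const letters = ALPHABET.split('');
  let state = seed >>> 0;
  for (let i = letters.length - 1; i > 0; i--) {
    state = (state * 1664525 + 1013904223) >>> 0;
    const j = state % (i + 1);
    [letters[i], letters[j]] = [letters[j], letters[i]];
  }
  return letters.join('');
}

// Map each letter from one alphabet to the other, keeping case and punctuation.
function substitute(text, from, to) {
  return text.replace(/[a-zA-Z]/g, (ch) => {
    const lower = ch.toLowerCase();
    const mapped = to[from.indexOf(lower)];
    return ch === lower ? mapped : mapped.toUpperCase();
  });
}

// Layer one: what gets stored in the page source, and what a scraper copies.
const scramble = (text, key) => substitute(text, ALPHABET, key);
// Layer two: what the reader's browser shows after reversing the mapping.
const unscramble = (text, key) => substitute(text, key, ALPHABET);

const key = makeKey(42);
const stored = scramble('Undefeated God of War, chapter 1', key);
console.log(stored);
console.log(unscramble(stored, key));
```

    A plain substitution like this is easy to reverse once someone notices it, so treat it as a speed bump rather than real protection.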
     
  18. Prosperous_Food

    Prosperous_Food Active Member

    MadHatter, do you have any clue what formula he used?
     
  19. MadHatter

    MadHatter [WindyWeather]

    @Prosperous_Food I don't know how he does it. I'm using a Chrome app called "Just Read" to read novels, and when I use it to read "Undefeated God of War", the text appears all scrambled. Of all the novels I'm following, this translator seems to have the best mechanism to protect his content without impeding the readers' experience.
     
  20. noisypixy

    noisypixy Sacatunn que pen, que summum que tun.

    Yet it's such a generic method that even "text in different color" is a better choice than this.

    Code:
    // Remove any "hidden" elements from the DOM.
    Array.from(
    	document.querySelectorAll('body *')
    )
    	.filter((el) => {
    		const cs = window.getComputedStyle(el);
    
    		return (
    			cs.display === 'none' ||
    			cs.visibility === 'hidden' ||
    			parseFloat(cs.opacity) < 0.5
    		);
    	})
    	.forEach(el => el.parentNode.removeChild(el));
    
    // Strip non-selectable property from Mozilla-based and WebKit-based browsers.
    Array.from(
    	document.querySelectorAll('body *')
    )
    	.forEach((el) => {
    		el.style.WebkitUserSelect = 'initial';
    		el.style.MozUserSelect = 'initial';
    		el.style.userSelect = 'initial';
    	});
    
    There. RIP.

    Seriously, use colors instead of this; at least your combination of colors might be different from other sites'. Whereas the "normal" hiding methods are the same for everyone.
     