
#1 2018-05-16 2:29 pm

apico
Member
Registered: 2013-01-23
Posts: 8

Feature-Request: Make this project safe for EU GDPR (DSGVO)

Great project! Thanks for helping to detect spammers all these years. With the new EU GDPR (DSGVO), however, I have removed this function from my website. I think using the current API is far from permitted under it.

I see that a new parameter "emailhash" has existed since 2016/10. But why insecure MD5? (SHA-1 is insecure too.) So in my eyes it is not valid for use under EU GDPR.

I would like all parameters to be REQUIRED as SHA-256 hashes (64 hex characters). All direct API calls with username, real IP, or e-mail address should be forbidden.

So instead of
https://www.stopforumspam.com/api?ip=1.2.3.4&email=mail%40domain.tld&f=json

A new API version that submits only SHA-256 values:
https://www.stopforumspam.com/apiv2?ip=6694F83C9F476DA31F5DF6BCC520034E7E57D421D247B9D34F49EDBFC84A764C&email=F220011F073A7CFC303FF774BD121B441820101E8DD0198B21B3B27EC1D11E01&f=json

That's a very small change to the existing code base, but it makes a big impact on privacy.
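A minimal sketch of what a client for such a request could look like. The `apiv2` endpoint is only the proposal from this post, not a live API, and the helper names and the lowercase/trim step for the email address are assumptions made for illustration.

```python
import hashlib
from urllib.parse import urlencode


def sha256_hex(value: str) -> str:
    """Return the uppercase SHA-256 hex digest (64 characters) of a UTF-8 string."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest().upper()


def build_hashed_query(ip: str, email: str) -> str:
    """Build a query URL for the proposed (hypothetical) apiv2 endpoint."""
    params = {
        "ip": sha256_hex(ip),
        # Assumed normalization: trim and lowercase before hashing.
        "email": sha256_hex(email.strip().lower()),
        "f": "json",
    }
    return "https://www.stopforumspam.com/apiv2?" + urlencode(params)


url = build_hashed_query("1.2.3.4", "mail@domain.tld")
print(url)
```

Both sides would have to hash exactly the same byte string, so any normalization has to be agreed on in advance.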

Last edited by apico (2018-05-16 4:29 pm)

Offline

#2 2018-05-16 8:40 pm

ronaldvanbelzen
Member
Registered: 2017-06-28
Posts: 7

Re: Feature-Request: Make this project safe for EU GDPR (DSGVO)

I think this service is allowed and already complies with GDPR, but I am not sure.

I also do not know about publishing the data that SFS collects unencrypted. You might have a point there.

About sending data: using POST instead of GET to check a spam entry over SSL might be better.

Last edited by ronaldvanbelzen (2018-05-17 7:48 am)

Offline

#3 2018-05-17 8:30 am

Dark Byte
Member
Registered: 2011-05-30
Posts: 13

Re: Feature-Request: Make this project safe for EU GDPR (DSGVO)

When using SSL, the URL (the part after the domain name) is encrypted as well.

Last edited by Dark Byte (2018-05-17 8:31 am)

Offline

#4 2018-05-17 8:36 am

apico
Member
Registered: 2013-01-23
Posts: 8

Re: Feature-Request: Make this project safe for EU GDPR (DSGVO)

It's not only a problem with sending. As you said, POST with SSL may be a solution here. This hint should be in the documentation, along with the advice to use https instead of http. I integrated SFS a long time before SSL was popular. :)

The primary issue here, in my eyes, is that stopforumspam receives unencrypted personal data just for a simple check. For a simple check, a secure hash is more than sufficient.

Last edited by apico (2018-05-17 8:44 am)

Offline

#5 2018-05-17 9:53 am

ronaldvanbelzen
Member
Registered: 2017-06-28
Posts: 7

Re: Feature-Request: Make this project safe for EU GDPR (DSGVO)

A dummy post to be able to say what I actually want.

Offline

#6 2018-05-17 9:57 am

ronaldvanbelzen
Member
Registered: 2017-06-28
Posts: 7

Re: Feature-Request: Make this project safe for EU GDPR (DSGVO)

This is an interesting blogpost about some other anti-spam service and GDPR.

It ends by describing another service that seems to be able to comply with GDPR.

It is a marketing-driven story, of course, but it does give some food for thought.

Last edited by ronaldvanbelzen (2018-05-17 10:34 am)

Offline

#7 2018-05-17 2:44 pm

JamesC
Member
Registered: 2010-01-09
Posts: 93
Website

Re: Feature-Request: Make this project safe for EU GDPR (DSGVO)

apico wrote: "I would like that all parameters MUST be send as hash sha256(64 characters). All direct API calls like Username, real IP or E-Mail-Address should be forbidden."

Please don't break the SFS API for those of us who are outside the EU and do not solicit EU eyeballs / pageviews / hitcounter stats.

You have the option to filter out potential EU citizens before running their IP, username, and/or email address through the SFS API:
- Do you reverse-lookup the IP to get their host? You may obtain a country from that (for example, t-connect.de indicates a German mobile connection).
- Does their email address end in .ie or .pl?
- Various blocklists exist to identify connections from Russia, China, even Australia and the US. Reverse the logic: if an IP is on a Chinese blocklist, then the connection does not originate from within the EU. ;)

And please don't overlook the blatantly obvious: SFS is intended to help block registrants to a website (such as a forum). The registrant willingly gives us a username, email address, and IP as part of registration -- therefore they have consented to the collection, processing, and retention of these bits of personally identifying information.

All you need to do is add a notice just above where your website asks for these, notifying the registrant that by providing these details, they are giving consent. As long as the notice is provided before you collect personally identifying information, and the natural person has the choice to not provide that information (by abandoning their registration attempt), you are in compliance with GDPR. :)

And ... again, blatantly obvious ... if an EU citizen gives false information, such as a name or email address that is not their own, their false information is not protected by GDPR.

BTW, bots are not "natural humans" under GDPR; bots have no GDPR rights at all. ;)

Last edited by JamesC (2018-05-17 2:45 pm)

Offline

#8 2018-05-17 3:58 pm

apico
Member
Registered: 2013-01-23
Posts: 8

Re: Feature-Request: Make this project safe for EU GDPR (DSGVO)

Like I said in my first post, I don't want this feature to replace the existing API; I suggest an apiv2 version for this, or new parameters.

So cool down :D

EDIT: Your whole logic is wrong (in my opinion). I'm an EU citizen. I can be on holiday in Russia, so I have a Russian IP, or an IP from another country, or a proxy IP. If an EU website uses my data, I can hire a lawyer. Your rights are not limited by a stupid IP address or the domain of an e-mail address. So your tests are useless. In the EU you should never submit personal data without permission, or much better, submit it only as a secure hash.

Last edited by apico (2018-05-17 4:13 pm)

Offline

#9 2018-05-17 7:32 pm

ronaldvanbelzen
Member
Registered: 2017-06-28
Posts: 7

Re: Feature-Request: Make this project safe for EU GDPR (DSGVO)

"In EU you never should submit personal data without permission"

IIRC there are 6 criteria that define when you are allowed to gather personal data, and permission is only one of them.

Offline

#10 2018-05-17 8:38 pm

apico
Member
Registered: 2013-01-23
Posts: 8

Re: Feature-Request: Make this project safe for EU GDPR (DSGVO)

Don't forget, a user can withdraw his permission at any time, and you must then delete his data everywhere. How do you delete their data here on SFS?

And you must make a contract with SFS, which is currently not possible? Hashing is the only way to use this project.

Last edited by apico (2018-05-18 2:24 pm)

Offline

#11 2018-05-21 12:05 am

pedigree
uıɐbɐ ʎɐqǝ ɯoɹɟ pɹɐoqʎǝʞ ɐ buıʎnq ɹǝʌǝu ɯ,ı
From: New Zealand
Registered: 2008-04-16
Posts: 7,054

Re: Feature-Request: Make this project safe for EU GDPR (DSGVO)

GDPR makes no reference to minimum security requirements for encryption or hashing. MD5 is considered cryptographically insecure if you wish to use it as a method of hashing a master password in order to encrypt a secret key, but no crypto is being done here; it's a simple one-way obfuscation method supported by every platform. A reminder: by hashing email addresses, you completely remove the ability to process blacklisted domains.
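To illustrate the point about blacklisted domains, here is a small sketch. The blacklist entry and function names are made up for the example; it only shows that a domain check needs the cleartext address, while two hashed addresses at the same domain share nothing that could be matched.

```python
import hashlib

# Hypothetical blacklist entry, used only for this illustration.
BLACKLISTED_DOMAINS = {"spam-domain.example"}


def domain_is_blacklisted(email: str) -> bool:
    """Check the domain part of a cleartext address against the blacklist."""
    domain = email.rsplit("@", 1)[-1].lower()
    return domain in BLACKLISTED_DOMAINS


def md5_hex(email: str) -> str:
    """MD5 hex digest of the lowercased address."""
    return hashlib.md5(email.lower().encode("utf-8")).hexdigest()


# With cleartext, the domain can be checked directly.
print(domain_is_blacklisted("alice@spam-domain.example"))

# With hashes, two addresses at the same domain yield unrelated digests,
# so there is no "domain part" left to compare.
hash_a = md5_hex("alice@spam-domain.example")
hash_b = md5_hex("bob@spam-domain.example")
print(hash_a, hash_b)
```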

You can POST to the API if you believe that someone is running a man-in-the-middle attack on your HTTP connections, and you can also use HTTPS (if you have a modern client supporting SNI), i.e.

curl -o- "https://europe.stopforumspam.org/api" -d "ip=1.2.3.4"

Web server logs are retained for 5 days and then deleted, mainly because they simply aren't needed after that and I don't have the server space for them. I can't remember the last time I even looked at the logs. POST data is NOT logged. If you want to ensure your queries are not logged, remain within the EU, and are not open to MITM attacks, then POST via HTTPS to the domain below.

You can force all your queries to remain within the EU by querying europe.stopforumspam.org/api
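For clients that do not shell out to curl, the same POST could be sketched in Python roughly as follows. The endpoint and the `ip` parameter are taken from the curl line above; the function names and the timeout are assumptions.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen

API_URL = "https://europe.stopforumspam.org/api"


def build_post_request(ip: str) -> Request:
    """Build a POST request so the query travels in the body, not in the URL."""
    body = urlencode({"ip": ip}).encode("ascii")
    return Request(API_URL, data=body, method="POST")


def query(ip: str, timeout: float = 10.0) -> bytes:
    """Send the request over HTTPS and return the raw response body."""
    with urlopen(build_post_request(ip), timeout=timeout) as response:
        return response.read()


request = build_post_request("1.2.3.4")
print(request.get_method(), request.full_url, request.data)
```

Calling `query("1.2.3.4")` would perform the actual network request; the lines above only build and print it.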

All in all, as a British citizen myself, I think GDPR is a horrible piece of legislation due to its poor definitions and huge lack of consideration, which is often the case when laws governing technology are written by people with no, or very poor understanding of the platform that will eventually be attacked by it.

As far as the legal scope of GDPR goes, it applies to companies that process PII of citizens of EU countries. This website is not a company, nor is it operated or owned by one.

To fall within the remit of the GDPR, the processing has to be part of an “enterprise”. Article 4(18) of the Regulation defines this as any legal entity that’s engaged in economic activity. SFS engages in no economic activity. This is where the vagueness of the legislation falls down. Nothing is stopping anyone from posting someone's details on Twitter, but Twitter will remove that data if contacted. SFS will do the same.

SpamHaus, SpamCop (etc) will all be facing this issue, and as GDPR is enforced by EU courts, rather than civil action, I'm going to put money on them being swamped with pointless cases from day #1, 99.999% without any merit.

A username by itself is not PII; combined with an email address, however, it could be PII, but that will have to be established by the courts, as the legislation is not specific as of yet.

Now, this is interesting, as GDPR has exemptions, including
- the prevention, investigation, detection or prosecution of criminal offences;
- other important public interests, in particular economic or financial interests, including budgetary and taxation matters, public health and security;

Stopping someone from posting adverts for illegal drugs, which could be fatal, or from selling illegal passports, which could be used for criminal fraud, terrorism, etc., is certainly something that I would consider to be in the public interest and to be prevention of criminal activity.

This entire process would need legal input. As the website has exactly $0.00 income (and $0.00 outgoing), I'm not going to front the 4 to 5 figure costs in order to consult a lawyer. If it comes to this then I would have to shut SFS down, so let's hope that doesn't happen. Now, if anyone here is a GDPR lawyer or knows of one that would be happy to help a community-driven crime prevention project such as this, please do let me know. Before a shutdown happens, I would move the server to the US, and the EU courts can try to get 10 million Euros out of me from there.

Offline

#12 2018-05-21 7:21 pm

kpatz
Member
Registered: 2008-10-09
Posts: 1,437

Re: Feature-Request: Make this project safe for EU GDPR (DSGVO)

I'm not in the EU, nor am I familiar with this GDPR, but there are only two ways SFS "sees" possibly personally identifying information:

1.  When an API call is done to query the database, such as during registration or when a forum admin queries the API to see if the information could be a spammer, and
2.  When a spammer's information is submitted to the database.

The information in question is the username, email and IP address of the registration, of course.  About the only thing that would make it "personally identifiable" is if the registrant uses their real name as their username, or their email contains their real name.

API queries aren't stored either, so in order to be intercepted there would have to be a man in the middle attack, or a compromise on one end of the connection (the forum's server or SFS's).

If a spammer is submitted, that information is stored in the SFS database, but spammer's data is mostly fake anyway... no spammer uses their real name or email when registering on forums, unless they're dumber than most... and I think the law is written in such a way that if there is criminal activity involved (such as what spamming usually is) then it wouldn't count anyway.

So, IMHO, the only concern would be API queries, and allowing all three data values to be hashed, or allowing SSL POST, would take care of that.


Spam happens when greed meets stupidity.

Offline

#13 2018-05-23 1:18 pm

apico
Member
Registered: 2013-01-23
Posts: 8

Re: Feature-Request: Make this project safe for EU GDPR (DSGVO)

Thanks for your answer, pedigree. :)

I'm not a lawyer, but in Article 4 "Definitions" of http://eur-lex.europa.eu/legal-content/EN/TXT/HTML/?uri=CELEX:32016R0679&from=DE you can read what personal data means. In fact, German law says that a "username" is personal data too. You can very often identify a person with it.

The new law applies to everyone, including private persons and non-commercial projects, whenever you collect personal data in any way.

When I submit or give access to personal data, I need a contract with the partner. In Germany we MUST send a letter with a contract to Google Ireland and get it back on paper. Using Google Analytics without this contract on real paper is illegal in Germany. Besides that, for every company with which you share personal data there must be a provable contract covering what the partner does with the personal data, as with a web host, technical support, or an external newsletter service. So it is needed with SFS too. :(

So it would be better to submit only a hash and not the real IP, email, or username. (MD5 is not secure enough as a hash.) That way you can see whether the hash exists in your DB. You would not need to replace your DB, only add new fields for the hashed username, email, and IP. Then nobody has to send personal data. :)

Offline

#14 2018-05-24 3:28 am

Maikuolan
Member
From: Perth, Western Australia
Registered: 2011-08-09
Posts: 799
Website

Re: Feature-Request: Make this project safe for EU GDPR (DSGVO)

Sending hashes is a good idea, but I wonder how technically feasible it would be to do this for SFS? Given that hashes are by their nature meant to be "one way", in order to actually check whether a "hash" (or the corresponding data for it) exists in the database, I imagine that SFS would likely need to maintain a "rainbow table" of sorts (slightly bending the meaning of the term here, but still) for all entries in its database, effectively doubling the size of its database. Given the limitations that already exist in terms of available disk space and such, I doubt that anything which could potentially double the database size would be technically feasible. At least, not within the 24 hours remaining until GDPR/DSGVO comes into effect.
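As a rough sketch of the lookup table described above: the server never has to reverse a hash, it only has to store the digest of each entry it already holds and test incoming digests for membership. The entries and names below are invented for the example and say nothing about how SFS actually stores data.

```python
import hashlib


def sha256_hex(value: str) -> str:
    """Lowercase SHA-256 hex digest of a UTF-8 string."""
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


# Hypothetical listed entries (IP, email, username).
stored_entries = ["203.0.113.7", "spammer@example.com", "cheap-pills-4u"]

# Built once, when an entry is submitted: digest -> original entry.
hash_index = {sha256_hex(entry): entry for entry in stored_entries}


def is_listed(query_hash: str) -> bool:
    """Membership test on the precomputed digests; case-insensitive hex."""
    return query_hash.lower() in hash_index


print(is_listed(sha256_hex("203.0.113.7")))
print(is_listed(sha256_hex("198.51.100.1")))
```

The extra storage is one digest per entry, which is the "doubling" concern raised in this post.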

Offline

#15 2018-05-26 10:39 pm

pedigree
uıɐbɐ ʎɐqǝ ɯoɹɟ pɹɐoqʎǝʞ ɐ buıʎnq ɹǝʌǝu ɯ,ı
From: New Zealand
Registered: 2008-04-16
Posts: 7,054

Re: Feature-Request: Make this project safe for EU GDPR (DSGVO)

MD5 is perfectly adequate for what it's doing. It's not cryptographically secure in that it should not be used to hash a password used to encrypt a master key used to further secure traffic (as with SSL, disk encryption, etc.), but as a method to obscure data and to provide downloadable file checksums, MD5 is fine.

Brute forcing SHA256 vs MD5 for an IP address would take exactly the same time, maybe even faster with SHA256 given the single operation GPU acceleration provided on AMD cards now.  Usernames are usually under 10 characters, SHA512 vs MD5 bruteforce for that is an hour, email addresses have a smaller known cleartext limit, so days to build tables that would cover listed domains.
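A small sketch of why the choice of hash matters little for IP addresses: the input space is so small that every candidate can simply be tried. To keep it quick, the example only enumerates one /16 (65,536 addresses) rather than all of IPv4; the prefix, the "secret" address, and the function names are assumptions for illustration.

```python
import hashlib
from typing import Optional


def digest(ip: str, algorithm: str) -> str:
    """Hex digest of an IP string with the named hashlib algorithm."""
    return hashlib.new(algorithm, ip.encode("ascii")).hexdigest()


def recover_ip(target: str, algorithm: str, prefix: str = "192.0.") -> Optional[str]:
    """Try every address under a /16 prefix until the digest matches."""
    for third in range(256):
        for fourth in range(256):
            candidate = f"{prefix}{third}.{fourth}"
            if digest(candidate, algorithm) == target:
                return candidate
    return None


secret = "192.0.2.55"
for algorithm in ("md5", "sha256", "sha512"):
    # The same enumeration works regardless of which hash was used.
    print(algorithm, recover_ip(digest(secret, algorithm), algorithm))
```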

I'll use SHA512 in the APIv2; however, I'll have a look at the overhead of back-porting this to the current API, given the way that data is stored in binary-encoded format in Redis. Each of the geographically redundant Redis nodes runs only on a small server, and the added memory required for lookups could be a problem. At the moment, only the first 64 bits of any hash are stored anyway, as I've never been able to collide more than 42 bits of data. I can do the same with SHA512: store only the first 64 bits and use a hashmap to map the SHA to the MD5. As the hash of both is stored incomplete, you would have to see a MITM attack on a 2048-bit SSL link in order to grab POST data before you could then brute force either MD5 or SHA512, something that would take either a nation state or a local network poisoning attack.
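The truncate-and-map idea can be sketched as below. This is not the actual SFS storage code: plain dicts stand in for the Redis hashsets, and the record contents and function names are invented for the example.

```python
import hashlib
from typing import Optional

PREFIX_BYTES = 8  # only the first 64 bits of each digest are kept


def md5_prefix(value: str) -> bytes:
    """First 64 bits of the MD5 digest."""
    return hashlib.md5(value.encode("utf-8")).digest()[:PREFIX_BYTES]


def sha512_prefix(value: str) -> bytes:
    """First 64 bits of the SHA-512 digest."""
    return hashlib.sha512(value.encode("utf-8")).digest()[:PREFIX_BYTES]


# Existing store, keyed by truncated MD5 (hypothetical record).
records = {md5_prefix("spammer@example.com"): {"frequency": 12}}

# Extra map built at submission time, while the cleartext is available.
sha_to_md5 = {sha512_prefix("spammer@example.com"): md5_prefix("spammer@example.com")}


def lookup_by_sha512(hex_digest: str) -> Optional[dict]:
    """Resolve a client-supplied SHA-512 hex digest to the MD5-keyed record."""
    key = bytes.fromhex(hex_digest)[:PREFIX_BYTES]
    md5_key = sha_to_md5.get(key)
    return records.get(md5_key) if md5_key is not None else None


client_digest = hashlib.sha512(b"spammer@example.com").hexdigest()
print(lookup_by_sha512(client_digest))
```

The added cost per entry is one 8-byte key plus one 8-byte value, rather than a full 64-byte SHA-512 digest.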

I will have to add support for nested array queries so that the domain part of an email address can be included in the call, so that you can still utilize blacklisted domains. I won't mandate hashed submissions, as this kills the ability to do wildcard testing, blacklisted domains, and IP blacklists. As hash queries require further client-side processing (such as normalizing email addresses, something that we can do on the fly but which would then require every deployed client to update), hashing should only be used where no other option exists.
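A sketch of the client-side work this implies. The normalization shown (trim and lowercase) is only an assumption; the rules SFS applies on the fly are not described in this thread, and a client whose rules differ would produce digests that never match.

```python
import hashlib
from typing import Tuple


def normalize_email(email: str) -> str:
    """Assumed normalization: strip surrounding whitespace and lowercase."""
    return email.strip().lower()


def hash_email_with_domain(email: str) -> Tuple[str, str]:
    """Return (SHA-512 hex digest of the normalized address, cleartext domain)."""
    normalized = normalize_email(email)
    domain = normalized.rsplit("@", 1)[-1]
    digest = hashlib.sha512(normalized.encode("utf-8")).hexdigest()
    return digest, domain


digest, domain = hash_email_with_domain("  Mail@Domain.TLD ")
print(domain, digest)
```

Sending the domain alongside the digest is what keeps domain blacklisting possible for hashed queries.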

As this project is funded entirely out of my own pocket and maintained in my own (very limited) spare time, and as blunt as it sounds, I have little interest in adhering to every individual country's privacy laws as they change, especially the ones that require lawyers and contracts. The time involved and the legal costs would simply become so much that projects like this would just shut their doors, leaving the criminal spammers the clear winners. This is the best reason for ensuring that your projects stay well out of German territory.

If anyone has concerns about PII data being sent, and the contractual issues behind that, then I suggest that permission to use PII is obtained prior to it being checked. If someone refuses to agree to having their credentials checked, then you can rightly refuse them access to the service you provide. I can guarantee you that banks will continue using credit agencies to verify applications, and refusal to allow checks will result in applications being declined.

If there is a lawyer that has enough spare time to go over every country's privacy laws, data handling legislation, and encryption disclosures, for free, then please let me know.

Offline

#16 2018-05-26 10:44 pm

pedigree
uıɐbɐ ʎɐqǝ ɯoɹɟ pɹɐoqʎǝʞ ɐ buıʎnq ɹǝʌǝu ɯ,ı
From: New Zealand
Registered: 2008-04-16
Posts: 7,054

Re: Feature-Request: Make this project safe for EU GDPR (DSGVO)

I'll have a poke around with a SHA512 to MD5 mapper for the datastores. You guys have little interest in the internal hashmap storage format of data, and it should do what is needed without blowing up the limited amount of memory that I have.

Edit 1: SHA512 support will only be via the POST method, and only via JSON queries. This is so that I can support email domain lookups/blacklisting. Beyond providing a domain as cleartext or hashed, I will make no effort to do domain blacklist matching, which is really a huge thing now.
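Purely as an illustration of a JSON query sent via POST: the request format had not been published at this point in the thread, so the `apiv2` URL, the field names (`sha512`, `domain`), and the nesting below are all hypothetical.

```python
import hashlib
import json
from urllib.request import Request

# Hypothetical endpoint; no APIv2 URL is given in this thread.
API_V2_URL = "https://europe.stopforumspam.org/apiv2"


def sha512_hex(value: str) -> str:
    """SHA-512 hex digest of a UTF-8 string."""
    return hashlib.sha512(value.encode("utf-8")).hexdigest()


def build_json_query(email: str, ip: str) -> Request:
    """Build a POST request whose JSON body carries digests plus the email domain."""
    normalized = email.strip().lower()  # assumed normalization
    payload = {
        # Nested entries so that a domain can travel with the hashed address.
        "email": [{"sha512": sha512_hex(normalized),
                   "domain": normalized.rsplit("@", 1)[-1]}],
        "ip": [{"sha512": sha512_hex(ip)}],
    }
    body = json.dumps(payload).encode("utf-8")
    return Request(API_V2_URL, data=body, method="POST",
                   headers={"Content-Type": "application/json"})


request = build_json_query("mail@domain.tld", "1.2.3.4")
print(request.data.decode("utf-8"))
```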

caveat emptor

Offline

#17 2018-05-27 2:32 am

Maikuolan
Member
From: Perth, Western Australia
Registered: 2011-08-09
Posts: 799
Website

Re: Feature-Request: Make this project safe for EU GDPR (DSGVO)

Sounds good to me. :-)

Worth noting: Given that SFS is not an "enterprise" as defined by GDPR (i.e., fewer than 250 "employees", and not engaging in business activities or any activities likely to generate revenue of any kind), the specific clauses regarding the processing of PII wouldn't be quite as strict for SFS as they would be for, say, Facebook or Google. If, e.g., someone adds Google AdWords to their site to earn a few extra dollars, then even a small, one-person operation such as a blog or internet forum technically becomes an "enterprise". With zero business, zero earnings or related activities, and fewer than 250 people "employed" by the project, though, I don't think SFS is an "enterprise" under the definition given by GDPR.

A big, important disclaimer: I'm not a lawyer, not a legal professional of any kind, so I'm not about to legally guarantee the accuracy of whatever I write here (in short: take it as opinion only). ;-)

But, based on what I've read of the new regulation and our obligations in regards to it, and based on my understanding of what I've read, the main thing that SFS really would need to do (from the perspective of SFS itself) is document everything. Obviously, SFS would need to process IP addresses, usernames, email addresses, etc., because that's what it does. It's a database intended to fight spammers, and those are the data points it uses to do so. There are, however, grounds to argue a legal basis for processing these data points (IMO): They aren't used by SFS for marketing or advertising, aren't on-sold to third parties, and they're used in a way that is integral to the service that SFS provides. Obvious stuff that everyone using SFS should already know, but just including some simple, static HTML page somewhere on the server that says "this is what we process (queried data), this is what we collect (submitted data), and this is why (the reasons; what's done with it)" would be legally helpful, in that, if someone in the future complains about SFS being non-compliant, or questions what's done with the data, we can point them to that static HTML page, tell them to start reading, that the page has existed since whenever, and so on. Asinine, obvious stuff, but such is the case with most legally-oriented documentation. Transparency, and working towards users being well-informed, I think, is one of the most important considerations of GDPR (from the perspective of non-enterprise entities).

Include some disclaimer on the page too, which says, "if you query data from SFS, or submit data to SFS, you agree that you are legally authorised to do so, are responsible for your own choices; SFS is not responsible for the actions that you yourself decide to undertake of your own volition" (and so on). At that point, the buck is passed to said users and third parties (and away from SFS) as to whether whatever they do with the SFS service and its data happens to comply or not comply with GDPR.

Clear links to privacy policies are important too, but you've already got that covered (the privacy policy for SFS can be found on all page footers already).

Beyond that, getting to the nitty-gritty of full compliance in terms of the PII processed and stored, I'm not convinced it would even be entirely possible in all cases. Hopefully the non-enterprise status, and implementing the aforementioned documentation suggestions, should be sufficient so that no legal firms anywhere will actually care enough to bother worrying about SFS at all.

Offline

#18 2018-05-27 7:42 am

pedigree
uıɐbɐ ʎɐqǝ ɯoɹɟ pɹɐoqʎǝʞ ɐ buıʎnq ɹǝʌǝu ɯ,ı
From: New Zealand
Registered: 2008-04-16
Posts: 7,054

Re: Feature-Request: Make this project safe for EU GDPR (DSGVO)

So, a quick technical update. The API uses Redis (my custom version) for queries, one that uses Interval Sets (excluded from these numbers) to perform IP address space lookups. It uses hashsets in order to store a lot of small data in the smallest memory set available. After some hours of playing around with data isolation to avoid further hash collisions (i.e. identical usernames and email addresses), I've got some numbers that I'm happy with going forward.

Going from the current Redis memory usage

# Memory
used_memory:171927864
used_memory_human:163.96M
used_memory_rss:179372032
used_memory_rss_human:171.06M
used_memory_peak:171927864
used_memory_peak_human:163.96M

# Keyspace
db0:keys=65149,expires=0,avg_ttl=0

to a cache where there is full support for SHA512 lookups of all data, including support for MD5 of email addresses to support the existing API functionality.  As only the top 64 bits are ever being used, the remaining hash data is discarded.

# Memory
used_memory:280994304
used_memory_human:267.98M
used_memory_rss:338268160
used_memory_rss_human:322.60M
used_memory_peak:283489904
used_memory_peak_human:270.36M

# Keyspace
db0:keys=65149,expires=0,avg_ttl=0
db1:keys=65536,expires=0,avg_ttl=0
db2:keys=65536,expires=0,avg_ttl=0
db3:keys=65536,expires=0,avg_ttl=0
db4:keys=65536,expires=0,avg_ttl=0

A total memory increase of about 100MB is well within the memory limits available on each of the servers running API nodes.
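The figure can be checked from the two INFO dumps above. A quick calculation using the byte counts exactly as posted gives roughly 104 MiB of growth in used_memory and about 152 MiB in resident set size; the variable and function names are just for this example.

```python
MIB = 1024 * 1024

# Byte counts copied from the two Redis INFO outputs above.
before = {"used_memory": 171927864, "used_memory_rss": 179372032}
after = {"used_memory": 280994304, "used_memory_rss": 338268160}


def growth_mib(metric: str) -> float:
    """Increase of a metric between the two snapshots, in MiB."""
    return (after[metric] - before[metric]) / MIB


for metric in before:
    print(f"{metric}: +{growth_mib(metric):.2f} MiB")
```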

Offline
