Fake news has become an ever more prevalent problem, and in recent years a series of technological advances based on artificial intelligence has unfortunately multiplied the seriousness of this threat.
These technologies have great potential for good, but they are being exploited for harm. Fake images, audio and videos, the famous ‘deepfakes’, are beginning to flood social networks, and the most serious problem we now face is that they are harder to detect than we think, while the techniques used to create them keep advancing.
When it is hard to detect more than 80% of manipulated videos
At Engadget we spoke with Andres Torrubia, co-founder and CEO of Fixr.is, known on Twitter as @antor, where he constantly publishes resources on a topic he is passionate about: artificial intelligence, machine learning and deep learning.
Andres, who has competed in the world's most important AI competitions against leading teams from China and the United States, recently reported from his account that one of the world's largest deepfake detection competitions, hosted on Kaggle, had ended a few days earlier, and the results are cause for concern.
A few days ago one of the competitions with the biggest prizes, $1 million in total, ended on @kaggle. The problem: detecting DEEP FAKES (ultrafalsos in Spanish).
Here are my thoughts 👇 pic.twitter.com/kh7JrsUl8Y
— Andres Torrubia (@antor) April 7, 2020
In his thread, Andres explains that the top-scoring experts in this deepfake detection competition achieved a hit rate of around 80%. That may sound high, but we are talking about fake videos that are virtually impossible for an average person to distinguish from real ones.
This also clashes with the deepfake detection rates reported in most published academic articles, which reach up to 99%. The reason: the data set used.
“In machine learning and deep learning there is a key word: generalize. To generalize means that the system you have trained on your training data (the training set) then has to work equally well with real data (in this case, the deepfakes you find out there).
In many academic articles, the data set used to test deepfake detection systems is very similar to the one used for training, and that is why you see sky-high hit rates; but this does not happen in reality, among other things because the systems that generate deepfakes evolve.”
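To make that generalization gap concrete, here is a minimal Python sketch. It is a toy illustration under stated assumptions, not the method of any competition entrant or paper: each video is reduced to a single invented "artifact score", the detector is just a threshold, and every number and function name (the means, the sample sizes, `sample`, `fit_threshold`, `accuracy`) is hypothetical.

```python
import random
from statistics import mean


def sample(n, real_mean, fake_mean, rng):
    """Return n real and n fake clips as (artifact_score, is_fake) pairs."""
    real = [(rng.gauss(real_mean, 1.0), 0) for _ in range(n)]
    fake = [(rng.gauss(fake_mean, 1.0), 1) for _ in range(n)]
    return real + fake


def fit_threshold(data):
    """Place the decision threshold midway between the two class means."""
    real_scores = [x for x, y in data if y == 0]
    fake_scores = [x for x, y in data if y == 1]
    return (mean(real_scores) + mean(fake_scores)) / 2


def accuracy(threshold, data):
    """Fraction of clips labelled correctly when score > threshold means fake."""
    hits = sum((x > threshold) == bool(y) for x, y in data)
    return hits / len(data)


rng = random.Random(0)  # fixed seed so the run is repeatable

# Assumption: the "old" generator leaves strong artifacts (mean score 4.0),
# while a newer generator leaves much weaker ones (mean score 1.0).
train = sample(2000, 0.0, 4.0, rng)
test_same = sample(2000, 0.0, 4.0, rng)  # same generator as the training set
test_new = sample(2000, 0.0, 1.0, rng)   # generator never seen in training

threshold = fit_threshold(train)
acc_same = accuracy(threshold, test_same)
acc_new = accuracy(threshold, test_new)

print(f"same generator as training: {acc_same:.2f}")
print(f"newer, unseen generator:    {acc_new:.2f}")
```

In this toy setup the detector scores very well on test clips drawn from the same generator it was trained on, and far worse on clips from the newer one, which mirrors the gap between the rates reported in papers and those seen in the competition.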
What deepfake detection systems look like today
For now the vast majority of deepfakes online are pornographic, but those of an electoral nature are an ever greater concern: the potential for viral manipulation in the political sphere is increasingly alarming. Just look at the progress shown by parodies such as ‘Team E’, a viral video of Spanish political leaders, which proves that these videos are already unstoppable.
The big problem here is that if citizens come to rely on the detection algorithms created by these experts, and those algorithms fail to filter out up to 20% of deepfakes, the damage this can cause to society is enormous. And that 20% is not going to stay there because, as Andres explains, the systems continue to evolve.
The challenge is to establish whether the systems that detect deepfakes today will still work on deepfakes made with techniques that appear in the next two or three months.
The authors of academic articles choose the data sets on which to train and also those on which to validate their results; in many cases the validation sets do not represent real-world worst-case scenarios.
Although it may be a little controversial to say, in research it is considered a good result to beat the performance reported by another research group, so there is no great incentive for you as a researcher to set yourself a validation set much more difficult than the ones used by other researchers.
How this is being tackled
This is where competitions such as the one on Kaggle become important. The organizers of the deepfake challenge include companies like Facebook, Amazon, Microsoft and more, along with a committee of university researchers who clearly understand the current situation.
Torrubia explains to us that competitions are a very interesting way to catalyze innovation because they reward results, not effort. The detection rates of 80% arise because participants have no access to the deepfakes on which their algorithm will be tested; they cannot even see them.
“This is critical because it closely resembles reality: in practice, your algorithm has to work well with any video, not just with those you yourself have chosen a priori; and the best way to check this is to do it the way they have done.”
There is also the monetary component: “In total there is $1 million in prizes. In addition to the reputation you gain by finishing in a prize-winning position, the prizes are hefty enough to attract both experts in the problem itself and participants from any other field.”
Deepfake detection is an unsolved problem, and it may become a game of cat and mouse in which the counterfeiters have the advantage. The more the techniques for creating these videos advance, the more the algorithms that detect them must advance too. As for us, we can no longer believe everything we see or hear.
Cover image: Deeptrace