
Google’s Magika file ID tool goes open-source

The AI-powered file identification system could be a boon for cyber security applications.

David Hollingworth
Tue, 20 Feb 2024

Google has announced that its Magika file identification tool is now open source, with the project hosted on GitHub.

There is also a web demo available for immediate testing.

“Today, we are open-sourcing Magika, Google’s AI-powered file-type identification system, to help others accurately detect binary and textual file types,” Google said in a blog post.


“Under the hood, Magika employs a custom, highly optimised deep-learning model, enabling precise file identification within milliseconds, even when running on a CPU.”

File identification is a problem for traditional methods, which often rely upon “a handcrafted collection of heuristics and custom rules to detect each file format”, according to Google. Writing those rules by hand takes time and, due to the human element, is rather error-prone. The approach is also comparatively slow.
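To make the rule-based approach concrete, here is a minimal Python sketch of signature matching on a file's leading bytes. It is an illustration only, not how any particular tool is implemented: the `SIGNATURES` table and the `identify_by_rules` function are hypothetical names, and the handful of signatures shown is nowhere near a complete rule set.

```python
# Hypothetical, deliberately tiny rule table: (byte offset, signature, label).
# Real rule-based tools carry thousands of hand-written entries like these.
SIGNATURES = [
    (0, b"\x89PNG\r\n\x1a\n", "png"),
    (0, b"%PDF-", "pdf"),
    (0, b"PK\x03\x04", "zip"),
    (0, b"\x7fELF", "elf"),
    (0, b"GIF87a", "gif"),
    (0, b"GIF89a", "gif"),
]


def identify_by_rules(data: bytes) -> str:
    """Return a coarse file-type label using hand-written byte signatures."""
    for offset, magic, label in SIGNATURES:
        if data[offset:offset + len(magic)] == magic:
            return label
    # No signature matched: fall back to a crude text-versus-binary check.
    try:
        data.decode("utf-8")
    except UnicodeDecodeError:
        return "unknown"
    return "text"


if __name__ == "__main__":
    print(identify_by_rules(b"%PDF-1.7\n"))          # matches the PDF rule
    print(identify_by_rules(b"PK\x03\x04..."))       # any ZIP container
    print(identify_by_rules(b"print('hello')\n"))    # falls through to "text"
```

The sketch also hints at why such rules are error-prone. A Word .docx file is a ZIP container, so it would be labelled "zip" here unless someone adds a further rule to look inside it, and Python source, JSON and plain prose all collapse into "text" because none of them starts with a fixed signature.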

Magika, however, uses a deep-learning model trained using Keras. It uses ONNX as an inference engine and is capable of identifying files within milliseconds. The model is only about one megabyte in size and achieved a score of 99.31 per cent on Google’s own one-million-file benchmark.

The nearest competitor, file-magic 5.44, scored only 81.3 per cent.

Google uses Magika internally as a security tool.

“Magika is used at scale to help improve Google users’ safety by routing Gmail, Drive, and Safe Browsing files to the proper security and content policy scanners,” Google said.

“Looking at a weekly average of hundreds of billions of files reveals that Magika improves file type identification accuracy by 50 per cent compared to our previous system that relied on handcrafted rules.”

Magika will also soon be integrated into VirusTotal, which already uses Google’s artificial intelligence (AI) smarts to detect malicious files.

“Magika will act as a pre-filter before files are analysed by Code Insight, improving the platform’s efficiency and accuracy,” according to Google.

“This integration, due to VirusTotal’s collaborative nature, directly contributes to the global cyber security ecosystem, fostering a safer digital environment.”

David Hollingworth

David Hollingworth has been writing about technology for over 20 years, and has worked for a range of print and online titles in his career. He is enjoying getting to grips with cyber security, especially when it lets him talk about Lego.
