We rely on smartphones and web browsers for news and information, but many websites make reading harder than it needs to be: complex layouts, intrusive navigation elements, and extraneous links clutter the experience, and the problem is compounded for people with accessibility needs.
Reading Mode on Android and Chrome addresses this. It improves accessibility by letting users customize contrast, adjust text size, choose more legible fonts, and use text-to-speech utilities to have articles read aloud.
We wanted to extend Reading Mode beyond news articles to many other kinds of content, and to do so without compromising user privacy. To that end, we built an on-device content distillation model.
Rather than relying on complicated rules and heuristics, we took a data-driven approach: we collected examples of many content types, labeled them, and used those labeled examples to train a model based on graph neural networks.
Crucially, we operate on accessibility trees rather than the traditional DOM. The accessibility tree provides a streamlined, assistive-technology-friendly representation of the page, which makes it well suited to distilling the essential parts of the content.
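To make the idea concrete, here is a minimal sketch of what a node of such a tree might look like. The `AccessibilityNode` class, its fields, and the example page are hypothetical illustrations, not the actual representation used by Reading Mode:

```python
from dataclasses import dataclass, field

@dataclass
class AccessibilityNode:
    """One node of a (hypothetical) accessibility tree."""
    role: str                      # e.g. "heading", "paragraph", "link"
    text: str = ""
    children: list = field(default_factory=list)

def count_nodes(node):
    """Count the nodes in the subtree rooted at `node`."""
    return 1 + sum(count_nodes(c) for c in node.children)

# A tiny article: a heading, a paragraph, and a navigation link.
page = AccessibilityNode("root", children=[
    AccessibilityNode("heading", "On-device distillation"),
    AccessibilityNode("paragraph", "Reading Mode distills essential content."),
    AccessibilityNode("link", "Home"),
])
print(count_nodes(page))  # 4
```

The `children` lists encode the tree's edges, which is exactly the structure a graph neural network consumes.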
We developed a tool that processes these accessibility trees with graph neural networks. Graph neural networks are a natural fit for tree-structured inputs because they model the connections and relationships within the tree directly. By analogy with a family tree, each node represents a member and each edge a relationship; the network learns to exploit these connections on its own, without manual feature engineering.
Our architecture follows the encode-process-decode paradigm. Starting from the tree representation of the content, we compute lightweight features for each node, then use the graph neural network to propagate information along the edges of the tree. This lets nodes share contextual information with their neighbors, providing a more informed basis for classification. After several rounds of this process, each node is decoded as either essential or non-essential.
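The encode-process-decode loop can be sketched in a few lines. This is a toy illustration only: it assumes a single scalar feature per node (say, text density) and substitutes mean-aggregation message passing with a fixed threshold for the learned network:

```python
import numpy as np

def encode(features):
    # "Encode": lightweight per-node features, used as-is in this sketch.
    return np.asarray(features, dtype=float)

def process(h, edges, rounds=3):
    # "Process": each round, every node averages its own state with the
    # states of its tree neighbours (mean-aggregation message passing).
    for _ in range(rounds):
        new_h = h.copy()
        for i in range(len(h)):
            neigh = [b if a == i else a for a, b in edges if i in (a, b)]
            if neigh:
                new_h[i] = (h[i] + h[neigh].mean(axis=0)) / 2.0
        h = new_h
    return h

def decode(h, threshold=0.5):
    # "Decode": classify each node as essential (True) or not.
    return h[:, 0] > threshold

# Root plus three children; the last node has low text density (a link).
labels = decode(process(encode([[0.5], [0.9], [1.0], [0.0]]),
                        edges=[(0, 1), (0, 2), (0, 3)]))
print(labels.tolist())  # [True, True, True, False]
```

After a few rounds of propagation, the low-density node is still classified as non-essential, while nodes surrounded by text-heavy neighbors are kept; a trained model replaces the hand-set threshold with learned parameters.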
How do we handle different languages while keeping the model lightweight? We restricted the feature set the model uses, which improves generalization across languages and reduces its size. The resulting Android model is 334 kB with a latency of 800 ms, and the Chrome model is 928 kB with a latency of 378 ms.
All processing happens on your device: your data never leaves it, so your privacy is protected by design.
So, how does it perform? Our models were trained on a broad set of webpages and native applications. They achieve an F1-score exceeding 0.9 for main-text content, 88% of articles are processed without missing any paragraphs, and in over 95% of cases the distilled content was judged valuable by readers.
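For reference, the F1-score is the harmonic mean of precision and recall over the per-node "essential" labels. A small self-contained computation (the labels below are made up purely to show the metric, not our evaluation data):

```python
def f1_score(y_true, y_pred):
    """F1 over the positive ("essential") class, from binary labels."""
    tp = sum(t and p for t, p in zip(y_true, y_pred))          # true positives
    fp = sum((not t) and p for t, p in zip(y_true, y_pred))    # false positives
    fn = sum(t and (not p) for t, p in zip(y_true, y_pred))    # false negatives
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Illustrative node labels: 1 = essential, 0 = non-essential.
y_true = [1, 1, 1, 0, 0, 1]
y_pred = [1, 1, 1, 0, 1, 1]
print(round(f1_score(y_true, y_pred), 3))  # 0.889
```

A score above 0.9 therefore means both that almost all essential nodes are kept (high recall) and that little clutter slips through (high precision).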
In a direct comparison on a set of English-language pages, our content distillation outperformed both DOM Distiller and Mozilla Readability.
With these improvements, Reading Mode now handles a much wider range of content, performs better, and keeps your data on your device. Happy reading!