June’s NAACL conference saw machine learning specialists from technology company Iprova present a paper introducing a new and effective method for the unsupervised training of machine learning algorithms to infer sentence embeddings. The NAACL (North American Chapter of the Association for Computational Linguistics) Human Language Technologies (HLT) conference took place at the Hyatt Regency New Orleans hotel, Louisiana, from June 1–6, 2018.
The research paper, entitled “Unsupervised Learning of Sentence Embeddings using Compositional n-Gram Features”, was presented by Matteo Pagliardini. Pagliardini is a senior machine learning engineer at Iprova and one of the three scientists who authored the research paper and developed Sent2Vec, the new model for unsupervised training. The other authors are Prakhar Gupta and Professor Martin Jaggi of École polytechnique fédérale de Lausanne (EPFL).
While deep learning has had several successes in recent years, the paper notes that these have relied almost exclusively on supervised training. Pagliardini singles out a research paper by Mikolov et al. (2013) for its success with semantic word embeddings (vector representations in which words with similar meanings lie close together) trained without supervision. The new paper presents a way of achieving similar success for longer sequences of text rather than individual words.
“There are very useful semantic representations available for words but producing and learning semantic embeddings for longer text has always proven difficult”, explained Pagliardini. “It was especially challenging to see whether such general-purpose representations could be obtained using unsupervised learning.”
“By taking inspiration from the existing C-BOW model of the Word2Vec algorithm, we were able to develop a computationally efficient method to train sentence embeddings. Our evaluations found that our method achieves a better performance on average than most other models, with a particular proficiency in evaluating sentence similarity. At NAACL HLT, we will explore our research further and explain where future work may take our Sent2Vec model.”
The paper was accepted for the NAACL HLT conference after an extensive review process from leading figures in the computational research community. The Sent2Vec model outlined in the paper is open source and available for use.
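As a rough illustration of the idea behind the paper's title, the sketch below builds a sentence vector by averaging the vectors of a sentence's words and word bigrams, then compares two sentences by cosine similarity. It is a minimal sketch under stated assumptions, not the released Sent2Vec code: the vectors are random placeholders rather than trained embeddings, and every function name here is hypothetical.

```python
import math
import random


def ngrams(tokens, n):
    """Return the contiguous n-grams of a token list, joined with underscores."""
    return ["_".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def features(sentence, max_n=2):
    """Unigrams followed by all n-grams up to max_n, using whitespace tokenisation."""
    tokens = sentence.lower().split()
    feats = []
    for n in range(1, max_n + 1):
        feats.extend(ngrams(tokens, n))
    return feats


def make_toy_table(sentences, dim=8, seed=0):
    """Assumption: stand-in for learned vectors, one random vector per feature."""
    rng = random.Random(seed)
    table = {}
    for sentence in sentences:
        for feat in features(sentence):
            if feat not in table:
                table[feat] = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
    return table


def embed(sentence, table, dim=8):
    """Average the vectors of every feature of the sentence found in the table."""
    vecs = [table[f] for f in features(sentence) if f in table]
    if not vecs:
        return [0.0] * dim  # no known features: fall back to the zero vector
    return [sum(column) / len(vecs) for column in zip(*vecs)]


def cosine(a, b):
    """Cosine similarity of two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


corpus = ["the cat sat on the mat", "the cat sat on the rug"]
table = make_toy_table(corpus)
similarity = cosine(embed(corpus[0], table), embed(corpus[1], table))
print(f"cosine similarity: {similarity:.3f}")
```

With trained vectors in place of the random table, sentences that share many words and bigrams would be expected to score higher than unrelated ones; the random placeholders here only demonstrate the mechanics.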
Sent2Vec in practice
Sent2Vec forms part of Iprova’s pioneering technology that provides a data-driven approach for the creation of commercially relevant inventions. Hundreds of patents have been filed based on Iprova’s inventions by some of the world’s most respected technology companies. The specialised algorithm allows the right invention to be created at the right time in a way that has never before been possible – and over 20 of the world’s best-known businesses have already benefited.
The technology brings together topics from seemingly distant areas, for example, inventively connecting an advance in geographic mapping to elevator scheduling, or an advance in autonomous vehicle control systems to personal healthcare. Other examples include connecting a specific drug delivery technique to a high value oil exploration problem, and the introduction of LED backlit displays to gesture recognition.
The company has kept a discreet profile since its inception, allowing global brands including Philips, Panasonic, and Deutsche Telekom to file hundreds of new patents based on its inventions. These inventions may provide the foundation for new products and services across a wide range of industries and sectors.
Iprova’s inventions are driven by advances outside of the areas where its customers are active, and are complementary to those created in their R&D labs.
Iprova’s approach creates inventions which have an improved chance of being disruptive due to their diversity and timing. During a recent project with Philips focused on nutrition, Iprova contributed to inventions driven by diverse advances in areas including healthcare, video processing, materials, genetics and predictive learning.
“Iprova complements Philips’ own research activities with its out-of-the-box inventions”, says Maaike van Velzen, Head of IP Portfolio Management at Philips. “I am very impressed with Iprova’s technical expertise and advanced thinking.”
Due to the success of the company’s technology, Iprova has grown significantly since its formation in 2010. Now, the company has three offices: one in Lausanne, Switzerland; one in Cambridge, UK; and a newly opened one in London. As a result, Iprova is now looking to grow its team of invention developers to match both the expanding capabilities of its AI system and the growing market demand for intelligent invention.
Jasper Van den Berg, an invention developer working at Iprova’s head office in Lausanne, asserts that Iprova’s invention developer role redefines inventors for the digital age. “Traditional inventors were scientists or engineers with a deep understanding of a specific technical field. This only gave the inventor access to a limited amount of research insight,” explains Van den Berg.
“Even collaborative inventing through teamwork only provides insight into a handful of additional fields, since it’s just a team of specialists. With such approaches to invention, researchers can only dig deeper into specific areas rather than offering genuine innovation by taking the field in a different direction.
“Iprova does this on a massive scale – in real-time – by using data from across the spectrum of human knowledge to make connections between ideas from different fields of study.”
Iprova’s invention developer role provides a unique perspective on this. The job involves scientists and engineers working in a role made possible thanks to AI, with invention developers using data presented by Iprova’s intelligent algorithms to create inventions that define the products and services of tomorrow.
“The invention developer is a job that goes hand in hand with technological advancement,” explains Julian Nolan, founder of Iprova. “Iprova is transforming invention by making use of data, algorithms and machines to streamline the research process and create inventions much faster and with greater diversity than would otherwise be possible.
“Our technology and invention developers have been successful in creating landmark inventions for some of the world’s best-known companies in the US and Asia, as well as in much of Europe. It allows us to operate in industries as diverse as healthcare, autonomous vehicles, finance and energy, which is only possible thanks to the data processing capabilities of our data-driven approach to invention. Our system has delivered jobs for the local economy and value to businesses and markets worldwide.”
Julian Nolan, CEO and founder, Iprova