Yasser Salem


e: Yasser [dot] Salem [at] gmail [dot] com

Follow updates on my research and teaching at UCD!

About me

I am a PhD student at University College Dublin (UCD), currently working on conversational recommender systems with Dr. Kevin McCarthy and Prof. Barry Smyth.

I received my Master's degree from the Institute of Technology, Blanchardstown, completing my MSc thesis in 2009, entitled "A generic framework for Arabic to English machine translation of simplex sentences using the Role and Reference Grammar linguistic model". This research was the first contribution using the Role and Reference Grammar (RRG) model as a basis for machine translation. My MSc thesis is available on the official Role and Reference Grammar website. My advisors were Dr. Brian Nolan and Mr. Arnold Hensman.



While working on my MSc, I published 6 papers. I was a reviewer for the International Arab Conference on Information Technology (ACIT 2008). I also delivered an invited talk at Dublin City University (DCU) in July 2008 entitled "UniArab: a universal machine translator system for Arabic based on Role and Reference Grammar".


Education

PhD Research:
University College Dublin
PhD, Artificial Intelligence, 2010--In Progress.

Recommender systems are a common way to promote products or services that may be of interest to a user, usually based on some profile of interests. The single-shot approach, which produces a ranked list of recommendations, is limited by design. It works well when a user’s needs are clear, but it is less suitable when a user’s needs are not well known, or where they are likely to evolve during the course of a session. In these scenarios it is more appropriate to engage the user in a recommendation dialog so that incremental feedback can be used to refine recommendations. This type of conversational recommender system is much better suited to help users navigate more complex product spaces. I am interested in improving the efficiency of recommender systems.
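The feedback loop described above can be sketched in a few lines. This is only a minimal illustration of critique-based refinement, not the system studied in this research: the catalogue, the `Product` fields, and the helpers `apply_critique`, `recommend` and `run_session` are all hypothetical names invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Product:
    name: str
    price: float
    megapixels: float


# Hypothetical catalogue, invented purely for illustration.
CATALOGUE = [
    Product("A", 300.0, 12.0),
    Product("B", 450.0, 16.0),
    Product("C", 200.0, 10.0),
    Product("D", 600.0, 24.0),
]


def apply_critique(candidates, current, feature, direction):
    """Keep candidates satisfying a critique such as ("price", "less"),
    interpreted relative to the currently recommended product."""
    reference = getattr(current, feature)
    if direction == "less":
        return [p for p in candidates if getattr(p, feature) < reference]
    if direction == "more":
        return [p for p in candidates if getattr(p, feature) > reference]
    raise ValueError(f"unknown direction: {direction!r}")


def recommend(candidates, current):
    """Pick the candidate closest to the current product, using a simple
    sum of relative feature differences as the distance (an assumption)."""
    def distance(p):
        return (abs(p.price - current.price) / current.price
                + abs(p.megapixels - current.megapixels) / current.megapixels)
    return min(candidates, key=distance)


def run_session(start, critiques):
    """Refine the recommendation one critique at a time."""
    current = start
    candidates = list(CATALOGUE)
    for feature, direction in critiques:
        remaining = apply_critique(candidates, current, feature, direction)
        if not remaining:
            break  # nothing satisfies the critique; keep the last recommendation
        candidates = remaining
        current = recommend(candidates, current)
    return current
```

Each critique narrows the remaining candidates, and the next recommendation is the remaining product most similar to the one just shown, so the user moves through the product space in small, directed steps rather than receiving a single ranked list.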


MSc Research:
Institute of Technology Blanchardstown (Dublin, Ireland)
MSc, Computational Linguistics, 2007 — 2009


Machine translation is a sub-field of computational linguistics that investigates the use of computer software to translate text or speech from one natural language to another. Arabic is one of the six major languages of the world, and one of the six official languages of the United Nations. The main motivation of this research is to provide a proof-of-concept implementation for translating Arabic into English, in order to bridge the gap between English speakers and Arabic speakers.

The aim of this research is to develop a rule-based lexical framework for Arabic language processing using the Role and Reference Grammar linguistic model. A system called UniArab is introduced to support the framework. The UniArab system takes Modern Standard Arabic (MSA) as input in the native orthography, parses the sentence(s) into a logical meta-representation and, using this, generates grammatically correct English output with full agreement and morphological resolution. UniArab utilizes an XML-based implementation of elements of the Role and Reference Grammar theory, and its representations for the universal logical structure of Arabic sentences.

Role and Reference Grammar (RRG) is a functional theory of grammar that posits a direct mapping between the semantic representation of a sentence and its syntactic representation. The theory allows a sentence in a specific language to be described in terms of its logical structure and grammatical procedures. RRG creates a linking relationship between syntax and semantics, and can account for how semantic representations are mapped into syntactic representations. We claim that RRG is highly suitable for machine translation of Arabic via an Interlingua bridge implementation model. RRG is a monostratal theory, positing only one level of syntactic representation, the actual form of the sentence, and its linking algorithm can work in both directions: from syntactic representation to semantic representation, or vice versa. In RRG, the semantic decomposition of predicates and their semantic argument structures are represented as logical structures. The lexicon in RRG takes the position that lexical entries for verbs should contain unique information only, with as much information as possible derived from general lexical rules. For this reason, and due to the functional nature of our linguistic model, we created our own lexicon.

We use the RRG theory to motivate the architecture of the lexicon, and the RRG bidirectional linking system to design and implement the parse and generate functions at the syntax-semantics interface. Through an input process with seven phases, including morphological and syntactic unpacking, UniArab extracts the universal logical structure of an Arabic sentence. Using the XML-based metadata representing the RRG logical structure (XRRG), UniArab generates an equivalent grammatical sentence in the target language through four output phases. We outline the conceptual structure of the UniArab system, which utilizes the framework and translates the Arabic language into another natural language. We follow the Interlingua design approach for machine translation (MT): we analyse the Arabic sentences to create a universal, abstract logical representation, and from this representation we generate English translations. We also explore how the characteristics of the Arabic language affect the development of an MT tool. The UniArab system has been tested on MSA input by generating equivalent grammatical sentences in English via the universal logical structure of the Arabic sentences, with accurate results. It provides more accurate translations than the automated translators from Google and Microsoft, though these systems have a much wider coverage than UniArab at present. This research demonstrates the capabilities of Role and Reference Grammar as a base for multilingual translation systems.

B.Sc. (Hons):
Institute of Technology Blanchardstown (Dublin, Ireland)
Bachelor of Science (Honours) in Computing, 2003 — 2007

I received a B.Sc. (Hons.) in computing with first class honours from the School of Informatics and Engineering at ITB. During my undergraduate studies, I designed and implemented a new encryption algorithm using technologies such as Java, MD5 and ODBC. My algorithm is secure, using public and private keys and 256-bit encryption based on XOR. The project applied mathematical strategies, in particular algebra and calculus, to the design and creation of the encryption algorithm in software.

The project won the prize for the best software project in May 2007.


Publications

Peer-Reviewed Conference Papers

K. McCarthy, Y. Salem and B. Smyth, “Experience-Based Critiquing: Reusing Critiquing Experiences to Improve Conversational Recommendation”, in Proceedings of the 18th International Conference on Case-Based Reasoning (ICCBR 2010), Alessandria, Italy, July 2010. [PDF]

B. Nolan and Y. Salem, “UniArab: An RRG Arabic-to-English Machine Translation Software”, in Proceedings of the 2009 International Conference on Role and Reference Grammar, University of California, Berkeley, USA, August 2009. [PDF]

Y. Salem and B. Nolan, “Designing an XML Lexicon Architecture for Arabic Machine Translation Based on Role and Reference Grammar”, in Proceedings of the 2nd International Conference on Arabic Language Resources and Tools (MEDAR 2009), Cairo, Egypt, April 2009. [PDF]

Y. Salem and B. Nolan, “An Arabic-to-English Machine translation system using an XML–based Role and Reference Grammar representation”, abstract accepted for the 23rd Annual Symposium on Arabic Linguistics, University of Wisconsin-Milwaukee, USA, April 2009.

Y. Salem and B. Nolan, “UNIARAB: An Universal Machine Translator System For Arabic Based On Role And Reference Grammar”, in Proceedings of the 31st Annual Meeting of the Linguistics Association of Germany (DGfS 2009), University of Osnabrück, Germany, March 2009. [PDF]

Y. Salem, A. Hensman and B. Nolan, “Implementing Arabic-to-English Machine Translation using the Role and Reference Grammar Linguistic Model” in Proceedings of the Eighth Annual International Conference on Information Technology and Telecommunication (ITT 2008), Galway, Ireland, October 2008. (Runner-up for Best Paper Award) [PDF]

Journal Papers

Y. Salem, A. Hensman and B. Nolan, “Towards Arabic to English Machine Translation”, in ITB Journal, Issue No. 17, May 2008, pp. 20-31. [PDF]

Book Chapter

B. Nolan and Y. Salem, “UniArab: RRG Arabic-to-English Machine Translation”, in New Perspectives in Role and Reference Grammar, Wataru Nakamura (ed.), London: Cambridge Scholars Publishing, pp. 312-344, December 2011.

MSc Thesis

Y. Salem, “A generic framework for Arabic to English machine translation of simplex sentences using the Role and Reference Grammar linguistic model”, MSc Thesis, ITB, 2009. [download thesis]

Elsewhere

You can also view my Google Scholar Citation Profile.

View Yasser Salem's profile on LinkedIn
View Yasser Salem's profile on academic research microsoft