Education

"Per aspera ad astra"
  • Ph.D. 2015 - 2018

    Deep Learning, Computer Engineering

    National University of Singapore

  • Ph.D. 2014 - 2015

    NLP & IR, Computer Science

    National University of Singapore

  • M.Eng. 2011 - 2013

    Software Engineering for ML

    Ss. Cyril and Methodius University

  • B.Sc. 2006 - 2010

    Computer Engineering

    Ss. Cyril and Methodius University

Work Experience

 
  • 2015 - 2016

    Teaching Assistant

    NUS Modules: CS1101S and CS3226 - Feedback

  • 2010 - 2014

    Software Engineer

    Freelancer: Web, Mobile, Server

  • Jun - Sep 2013

    Research Assistant Intern

    Institute for Infocomm Research, A*STAR

  • Jun - Sep 2009

    Software Engineer Intern

    Netcetera, AG

Publications

  • 2017
    Efficient Hyperparameter Optimization of Deep Learning Algorithms Using Deterministic RBF Surrogates
    Automatically searching for optimal hyperparameter configurations is of crucial importance for applying deep learning algorithms in practice. Recently, Bayesian optimization has been proposed for optimizing hyperparameters of various machine learning algorithms. Those methods adopt probabilistic surrogate models like Gaussian processes to approximate and minimize the validation error function of hyperparameter values. However, probabilistic surrogates require accurate estimates of sufficient statistics (e.g., covariance) of the error distribution and thus need many function evaluations with a sizeable number of hyperparameters. This makes them inefficient for optimizing hyperparameters of deep learning algorithms, which are highly expensive to evaluate. In this work, we propose a new deterministic and efficient hyperparameter optimization method that employs radial basis functions as error surrogates. The proposed mixed integer algorithm, called HORD, searches the surrogate for the most promising hyperparameter values through dynamic coordinate search and requires many fewer function evaluations. HORD does well in low dimensions but it is exceptionally better in higher dimensions. Extensive evaluations on MNIST and CIFAR-10 for four deep neural networks demonstrate HORD significantly outperforms the well-established Bayesian optimization methods such as GP, SMAC, and TPE. For instance, on average, HORD is more than 6 times faster than GP-EI in obtaining the best configuration of 19 hyperparameters.

    Ilija Ilievski, Taimoor Akhtar, Jiashi Feng, and Christine Annette Shoemaker

    AAAI-17 PDF Supplement Poster Code BibTeX

  • 2016
    Hyperparameter Transfer Learning through Surrogate Alignment for Efficient Deep Neural Network Training
    Recently, several optimization methods have been successfully applied to the hyperparameter optimization of deep neural networks (DNNs). The methods work by modeling the joint distribution of hyperparameter values and corresponding error. Those methods become less practical when applied to modern DNNs whose training may take a few days and thus one cannot collect sufficient observations to accurately model the distribution. To address this challenging issue, we propose a method that learns to transfer optimal hyperparameter values for a small source dataset to hyperparameter values with comparable performance on a dataset of interest. As opposed to existing transfer learning methods, our proposed method does not use hand-designed features. Instead, it uses surrogates to model the hyperparameter-error distributions of the two datasets and trains a neural network to learn the transfer function. Extensive experiments on three CV benchmark datasets clearly demonstrate the efficiency of our method.

    Ilija Ilievski and Jiashi Feng

    PDF

  • 2016
    A Focused Dynamic Attention Model for Visual Question Answering
    Visual Question and Answering (VQA) problems are attracting increasing interest from multiple research disciplines. Solving VQA problems requires techniques from both computer vision, for understanding the visual contents of a presented image or video, and natural language processing, for understanding the semantics of the question and generating the answers. Regarding visual content modeling, most existing VQA methods adopt the strategy of extracting global features from the image or video, which inevitably fails to capture fine-grained information such as the spatial configuration of multiple objects. Extracting features from auto-generated regions, as some region-based image recognition methods do, cannot essentially address this problem and may introduce an overwhelming number of features irrelevant to the question. In this work, we propose a novel Focused Dynamic Attention (FDA) model to provide image content representations that are better aligned with the posed questions. Being aware of the key words in the question, FDA employs an off-the-shelf object detector to identify important regions and fuses the information from the regions and global features via an LSTM unit. Such question-driven representations are then combined with the question representation and fed into a reasoning unit for generating the answers. Extensive evaluation on a large-scale benchmark dataset, VQA, clearly demonstrates the superior performance of FDA over well-established baselines.

    Ilija Ilievski, Shuicheng Yan and Jiashi Feng

    PDF       BibTeX

  • 2014
    Intelligent Tool for Modelling and Simulation of Urban Development
    This master thesis presents a tool that employs intelligent technologies to capture the patterns of urban change driven by a diverse set of context factors. Data mining provides opportunities, which complement and extend the knowledge previously obtained with other approaches. A series of exploratory case studies were done to showcase the modeling and predictive capabilities of the tool. A number of simulations have revealed distinctive local patterns of urban change in the city of Skopje, shaped by local urban spatial and institutional structures. This thesis shows the importance of intelligent technologies in the interpretation of the historical evidence of urban development.

    Ilija Ilievski, Master Thesis, Ss. Cyril & Methodius University, Macedonia.

    PDF (in Macedonian)

  • 2013
    Personalized News Recommendation Based On Implicit Feedback
    This paper presents a personalized news recommendation system that combines effective ways of understanding new articles with novel ways of modeling evolving user interest profiles to deliver relevant news articles to a user. A news article is represented as a taxonomy of hierarchical abstractions that capture different semantic facets of the news story. A user's interest profile is modeled as an evolving interest over these facets. A user's interest in individual articles is determined using a novel SWL (select-watch-leave) interest modeling framework that leverages a detailed analysis of the user's usage history. Initial performance comparisons with state-of-the-art personalized ranking approaches are promising.

    Ilija Ilievski and Sujoy Roy, ACM RecSys.

    PDF       BibTeX

  • 2013
    Discovering Patterns of Urban Development
    The goal of this research is to develop a tool that employs intelligent technologies to capture the patterns of urban change driven by a diverse set of context factors. Data mining provides opportunities, which complement and extend the knowledge previously obtained with other approaches. A series of exploratory case studies are in place to investigate the modeling and predictive capabilities of the tool. A number of simulations have revealed distinctive local patterns of urban change in the city of Skopje, shaped by local urban spatial and institutional structures. This study shows the importance of intelligent technologies in the interpretation of the historical evidence of urban development.

    Ilija Ilievski, Sonja Gievska and Ognen Marina, ICT Innovations.

    PDF       BibTeX

  • 2013
    Discovering Patterns of Urban Development in Skopje
    This research attempts to test a series of hypotheses on how the mechanisms of social and economic stratification have manifested in urban space and whether population dynamics has reconfigured the spatiality of the city landscape. This paper reflects upon the challenges surrounding the efforts in recognizing and interpreting the patterns of urban change. Our efforts are directed towards correlating real-world emergent patterns of urban change to contextual knowledge and incorporating them into the model's predictive capabilities. The suitability of an extensive set of machine learning algorithms for simulation and prediction of urban development is investigated. Our long-term goal has been to lay out a foundation in terms of a knowledge base and a tool that will accommodate future exploration of different research scenarios related to urban dynamics. The specific goal of this research is to develop a tool that employs intelligent technologies to capture the patterns of urban change driven by a diverse set of context factors. Data mining provides opportunities, which complement and extend the knowledge previously obtained with other approaches.

    Sonja Gievska, Ognen Marina and Ilija Ilievski, 9th Congress "Virtual City and Territory".

    PDF       BibTeX

Contact

Email: ilija.ilievski@u.nus.edu

Address: Vision and Machine Learning Lab, E4-#08-27, 4 Engineering Drive 3, National University of Singapore, Singapore 117583

Modified: 15 February 2017