Applicants MUST be Australian citizens or permanent residency (PR) holders with more than five years of Australian work experience.
The successful candidate(s) will require an in-depth understanding of data-science concepts and issues, along with working knowledge of and experience with unstructured data (images and text). They will take the lead in solving complex data-science problems and will guide and mentor others in these areas.
Our ideal Job Seeker provides advice to the organisation on data-science issues across the following knowledge areas:
* Applies knowledge in mathematics, such as Linear Algebra and Calculus.
* Applies knowledge in statistics, such as Multivariate Statistics and Bayesian Probability Theory.
* Applies computer programming skills across a diverse range of languages, such as R, Python, Scala, Java and C/C++.
* Applies technical knowledge and experience in various data science areas such as artificial intelligence, machine learning, data mining, text mining, image processing, data visualisation, statistical modelling and behavioural analytics.
* Applies experience in data and system engineering, including data pipeline and end-to-end system engineering, to train and productionise analytical models.
* Applies experience in system administration, including assuring privacy, growing and shrinking clusters, installing hardware and software, and ensuring data is properly replicated.
* Applies technical expertise to support our Big Data Analytics capability, such as the Hadoop Ecosystem.
Critically the Job Seeker will have:
* Practical experience: demonstrable practical experience using Machine Learning and Artificial Intelligence techniques, and delivering analytical solutions to business clients.
* Security clearance: a willingness to apply for a Baseline clearance (or holding one already).
* Languages: proficiency in SQL and in either R or Python for statistical programming. Java skills are highly advantageous, as is a professional commitment to maintaining code quality.
* Data: an ability to work with datasets of all types and formats, particularly images and unstructured text, and an understanding of data scales and sizes and the related techniques, risks and impacts.
* Tools: experience with Hadoop/Spark, Solr, Tika, Tesseract, Hive, Impala, TensorFlow, D3.js, R Shiny and Superset.
* An ability to produce business-ready output with key action items
* An understanding of the software development life cycle (SDLC) processes
* An awareness of deployment using Git and a CI pipeline