Data Engineer
Role and Responsibilities:
Develop programs in Scala, Java and Python as part of data cleaning and processing.
Conduct complex data analysis and report on results
Design and develop distributed, high-volume, high-velocity, multi-threaded event processing systems using the Core Java technology stack.
Prepare data for prescriptive and predictive modeling
Develop efficient code with Core Java and Big Data technologies for the various use cases built on the platform.
Maintain operational excellence, ensuring high availability and platform stability.
Implement scalable solutions to handle ever-increasing data volumes, using big data and cloud technologies such as Apache Spark and Hadoop.
Assist in designing, enhancing, and constructing new and existing features using performant, scalable, secure, documented, and maintainable code
Quickly produce well-organized, optimized, well-tested, and documented code
Meet deadlines and satisfy requirements.
Participate in agile ceremonies such as iteration planning, retrospective, and daily stand-ups
A data engineer's role typically falls into one of three categories: generalist, pipeline-centric, or database-centric.
Manage, optimize, oversee, and monitor data retrieval, storage, and distribution.
Relevant experience with an emphasis on data integration in structured and unstructured formats such as flat files, JSON, and XML
Strong knowledge of OOP and reusable coding practices
Maintain a data warehouse and analytics environment, and write scripts for data integration and analysis
Process the data with Hive, Hadoop, and Spark.
Working knowledge of Python, serverless environments, microservices, and RESTful/SOAP APIs is a plus
Experience with practices such as unit testing, integration testing, proper code documentation, and appropriate logging to develop highly maintainable and reliable code
Collaborate with other Engineers and Operations to design and implement features
Know how to build and maintain database systems; be fluent in programming languages such as SQL, Python, and R; be adept at finding warehousing solutions and using ETL (Extract, Transform, Load) tools; and understand basic machine learning and algorithms
Communicate effectively and efficiently with teammates and other departments
Develop reusable integrations, both inbound and outbound, between Cami.AI systems and other customer platforms.
Ingest data from files, streams, and databases.