Data Engineer
Qualifications:-
Bachelor's/Master's degree in Computer Science, Computer Engineering, or a related field, or equivalent experience
Role and Responsibilities:-
Manage, optimize, oversee, and monitor data retrieval, storage, and distribution.
Relevant experience with an emphasis on data integration across structured and unstructured formats such as flat files, JSON, and XML (see the format-normalization sketch after this list)
Strong knowledge of OOP and reusable coding practices
Working knowledge of Python, serverless environments, microservices, and RESTful/SOAP APIs is a plus (a small REST client, paired with a unit test, is sketched after this list)
Experience using practices such as unit testing, integration testing, proper code documentation, and appropriate logging to develop highly maintainable and reliable code
Collaborate with other Engineers and Operations to design and implement features
Communicate effectively and efficiently with teammates and other departments
Develop reusable integrations, both inbound and outbound, between Cami.AI systems and other customer platforms.
Ingest data from files, streams, and databases, and process the data with Hive, Hadoop, and Spark.
Develop programs in Scala, Java, and Python for data cleaning and processing.
Design and develop distributed, high-volume, high-velocity, multi-threaded event-processing systems using the Core Java technology stack (see the producer/consumer sketch after this list).
Develop efficient code for the various use cases built on the platform, leveraging Core Java and Big Data technologies.
Maintain operational excellence, ensuring high availability and platform stability
Implement scalable solutions to meet ever-increasing data volumes, using big data and cloud technologies such as Apache Spark and Hadoop.
Assist in designing, building, and enhancing new and existing features using performant, scalable, secure, documented, and maintainable code
Quickly produce well-organized, optimized, well-tested, and documented code
Meet deadlines and satisfy requirements. Participate in agile ceremonies such as iteration planning, retrospectives, and daily stand-ups
The role typically falls into one of three categories: generalist, pipeline-centric, or database-centric.
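
A common baseline for the multi-format integration work named above is normalizing flat files, JSON, and XML into one record shape. The sketch below is illustrative only, using the Python standard library; the file paths and the "record" tag name are assumptions, not details from this posting.

import csv
import json
import xml.etree.ElementTree as ET

def load_flat_file(path):
    # Read a delimited flat file into a list of dicts keyed by the header row.
    with open(path, newline="") as f:
        return list(csv.DictReader(f))

def load_json(path):
    # Read a JSON array of record objects.
    with open(path) as f:
        return json.load(f)

def load_xml(path, record_tag="record"):
    # Flatten each <record> element's children into a dict (tag name assumed).
    tree = ET.parse(path)
    return [{child.tag: child.text for child in elem}
            for elem in tree.getroot().iter(record_tag)]

Each loader returns the same list-of-dicts shape, so downstream cleaning code does not need to know which source format a batch came from.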
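The REST and testing expectations above pair naturally: a small client that logs failures, plus a unit test that mocks the network call. The endpoint route, payload shape, and function name below are hypothetical.

import logging
import unittest
from unittest import mock

import requests

logger = logging.getLogger(__name__)

def fetch_orders(base_url, customer_id, timeout=10):
    # GET a customer's orders; log failures with a traceback, then re-raise.
    url = f"{base_url}/customers/{customer_id}/orders"  # hypothetical route
    try:
        resp = requests.get(url, timeout=timeout)
        resp.raise_for_status()
        return resp.json()
    except requests.RequestException:
        logger.exception("Failed to fetch orders from %s", url)
        raise

class FetchOrdersTest(unittest.TestCase):
    @mock.patch("requests.get")
    def test_returns_parsed_body(self, mock_get):
        # Stub the HTTP layer so the test exercises only our logic.
        mock_get.return_value.raise_for_status.return_value = None
        mock_get.return_value.json.return_value = [{"id": 1}]
        self.assertEqual(fetch_orders("http://api.test", 42), [{"id": 1}])

if __name__ == "__main__":
    unittest.main()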
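The posting names a Core Java stack for event processing; for consistency with the other sketches here, the same producer/consumer pattern is shown in Python, where queue.Queue plays the role of Java's BlockingQueue. The event shape and worker count are illustrative.

import queue
import threading

SENTINEL = object()  # tells a worker to shut down

def worker(events):
    # Drain events until the shutdown sentinel arrives.
    while True:
        event = events.get()
        if event is SENTINEL:
            events.task_done()
            break
        print("handled", event)  # stand-in for real processing
        events.task_done()

events = queue.Queue(maxsize=1000)  # bounded queue applies backpressure
workers = [threading.Thread(target=worker, args=(events,)) for _ in range(4)]
for t in workers:
    t.start()

for i in range(10):  # stand-in for a real event stream
    events.put({"id": i})
for _ in workers:
    events.put(SENTINEL)  # one sentinel per worker
events.join()  # block until every queued item has been processed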
Skills Required:-
Hands-on experience with any of the following tools/technologies is expected, with PySpark being the must-have skill (a minimal PySpark sketch follows this list)
Hands-on knowledge of ADF (Azure Data Factory)
PySpark experience (must have)
Hands-on knowledge of the IICS (Informatica Intelligent Cloud Services) tool
Hands-on knowledge of Snowflake
Hands-on knowledge of Azure Synapse
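
Since PySpark is the must-have here, a minimal ingest-and-clean sketch follows; the input path, column names, and output location are placeholders, not details from this posting.

from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("ingest-clean").getOrCreate()

# Ingest: read newline-delimited JSON events from a landing zone (path assumed).
raw = spark.read.json("/data/landing/events/*.json")

# Clean: drop rows missing the key, trim strings, deduplicate on the key.
cleaned = (
    raw.dropna(subset=["event_id"])
       .withColumn("user_name", F.trim(F.col("user_name")))
       .dropDuplicates(["event_id"])
)

# Store: write partitioned Parquet for downstream Hive/Synapse queries.
cleaned.write.mode("overwrite").partitionBy("event_date").parquet("/data/curated/events")

The same DataFrame code runs unchanged whether the cluster is on-premises Hadoop or a cloud runtime such as a Synapse Spark pool.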