Python Software Engineer - Structured data and Data automation
About us
At RavenPack, we are at the forefront of developing the next generation of generative AI tools for the finance industry and beyond. With 20 years of experience as a leading big data analytics provider for financial services, we empower our clients—including some of the world's most successful hedge funds, banks, and asset managers—to enhance returns, reduce risk, and increase efficiency by integrating public information into their models and workflows. Building on this expertise, we are launching a new suite of GenAI and SaaS services, designed specifically for financial professionals.
Join a Company that is Powering the Future of Finance with AI
RavenPack has been recognized as the Best Alternative Data Provider by WatersTechnology and has been included in this year’s Top 100 Next Unicorns by Viva Technology. We have recently launched Bigdata.com, a next-generation platform aimed at transforming financial decision-making.
About the role
We are seeking a Python Software Engineer to join our Structured Data Squad, focusing on structured data processing and automation. The role centers on delivering end-to-end data applications, maintaining and improving existing data engineering pipelines, and integrating with APIs to ensure reliable data flow. You’ll also contribute to building AI agents that support tasks such as knowledge graph construction and unstructured data ingestion. This role requires collaboration with Data Scientists, Backend Engineers, and Product Managers to bring automation, scale, and intelligence to structured data workflows.
This role offers a hybrid work environment in our Marbella and Madrid offices.
Your Ramp-up Journey
In your first month, you will focus on getting accustomed to our systems, familiarizing yourself with the existing ML engineering projects, and beginning to take ownership of core tasks such as prompt evaluation and entity validation. You will work closely with your team to ship production systems and enhance our ability to manage both structured and unstructured data at scale, particularly in structured data onboarding and ingestion.
After six months, you will have successfully designed, implemented, and maintained ETL pipelines for structured data ingestion, ensuring high data quality across workflows. You will collaborate effectively with your team to enhance entity recognition systems and support the integration of ML/NLP components into larger workflows. Your ability to write production-grade Python code and ensure the scalability and performance of our systems will have significantly strengthened our data processing capabilities.
Key Responsibilities
● Design, implement, and maintain ETL pipelines for structured data ingestion, ensuring efficient data flow and processing.
● Utilize tools such as Spark, PySpark, Airflow, or Airbyte to automate workflows, enhancing operational efficiency.
● Integrate data from external providers and internal sources into production systems, guaranteeing seamless data access.
● Ensure data quality, consistency, and reliability across pipelines, thereby supporting data integrity.
● Implement prompt evaluation workflows to optimize extraction from unstructured data, increasing the effectiveness of our data models.
● Build and deploy applications that utilize ML or AI models, including transformers, in production environments.
● Support the integration of ML/NLP components into larger workflows, such as knowledge graph population and content enrichment.
● Experiment with fine-tuning models as required to improve performance and outcomes.
● Write production-grade Python code for data workflows and ML applications, ensuring high standards of software quality.
● Ensure scalability, performance, and maintainability of systems, contributing to long-term operational success.
● Work with orchestration and infrastructure tools such as Docker, Kubernetes, AWS, and CI/CD pipelines to enhance deployment processes.
● Collaborate with team members on the deployment and monitoring of applications, ensuring optimal functionality.
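The first responsibility above, building ETL pipelines for structured data ingestion with quality gates, can be sketched in plain Python. This is a toy illustration of the extract-transform-load pattern only; the record shape, function names, and validation rules are hypothetical, not RavenPack's actual stack or schema:

```python
from dataclasses import dataclass
from typing import Iterable

@dataclass
class Record:
    entity: str
    value: float

def extract(raw_rows: Iterable[dict]) -> list[dict]:
    # Pull rows from a source (here: an in-memory stand-in for an API or file),
    # keeping only rows that name an entity.
    return [row for row in raw_rows if row.get("entity")]

def transform(rows: list[dict]) -> list[Record]:
    # Normalize entity names and coerce values, dropping malformed rows.
    out = []
    for row in rows:
        try:
            out.append(Record(entity=row["entity"].strip().upper(),
                              value=float(row["value"])))
        except (KeyError, ValueError):
            continue  # quality gate: skip rows that fail validation
    return out

def load(records: list[Record], sink: list[Record]) -> int:
    # Write validated records to the destination; return how many landed.
    sink.extend(records)
    return len(records)

# One pipeline run over a small mixed-quality batch.
raw = [{"entity": " acme ", "value": "3.5"},
       {"entity": "", "value": "1"},           # dropped: empty entity
       {"entity": "Globex", "value": "oops"}]  # dropped: non-numeric value
warehouse: list[Record] = []
loaded = load(transform(extract(raw)), warehouse)
```

In production these three stages would typically be scheduled as tasks in an orchestrator such as Airflow, with each stage monitored and retried independently.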
Qualifications
● Demonstrate strong Python development skills, particularly in the context of creating orchestration frameworks using Docker, AWS, and CI/CD pipelines for REST APIs and Lambda functions.
● Possess a hands-on ETL background using tools like Spark, PySpark, Airflow, or Airbyte, showcasing your ability to manage data workflows.
● Have proficiency with AI prompt/evaluation frameworks and experience building and maintaining agentic systems in production environments.
● Show familiarity with MCP tools and LLM-based automation workflows, enhancing your ability to contribute to our projects.
● Exhibit strong problem-solving skills and an ability to work with complex, multi-source datasets.
● Architect and implement serverless architectures for efficient resource utilization and scalability in cloud environments.
● Utilize cloud technologies like AWS to deploy and manage backend services, ensuring robust system performance.
● Design, develop, and maintain scalable backend services, microservices, data pipelines, and APIs to support our operational needs.
● Communicate effectively in English, both verbally and in writing, facilitating collaboration across teams.
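The serverless qualification above refers to request handlers of the kind run on AWS Lambda behind a REST API. A minimal sketch, following the API Gateway proxy event/response convention; the field names and validation logic are illustrative assumptions:

```python
import json

def handler(event, context):
    # Minimal AWS Lambda-style handler behind a REST route. The event carries
    # the request body as a JSON string; the response dict carries a status
    # code and a JSON body, per the API Gateway proxy integration shape.
    body = json.loads(event.get("body") or "{}")
    entity = body.get("entity", "").strip().upper()
    if not entity:
        return {"statusCode": 400,
                "body": json.dumps({"error": "entity required"})}
    return {"statusCode": 200,
            "body": json.dumps({"entity": entity})}

# Simulate one invocation locally (no AWS dependency needed for the sketch).
resp = handler({"body": json.dumps({"entity": " acme "})}, None)
```

Keeping the handler a plain function like this makes it easy to unit-test locally before wiring it into a CI/CD deployment pipeline.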
Desirable/Nice to have
● Have experience training and fine-tuning transformer-based models, especially BERT and its variants, enhancing our AI capabilities.
● Possess exposure to Retrieval-Augmented Generation (RAG) systems, contributing to innovative data processing approaches.
● Show experience with statistical classification of structured/unstructured data, improving our analytical capabilities.
● Be familiar with financial datasets and domain-specific ontologies, enriching your contributions to our projects.
What's in it for you?
● International Culture: With its headquarters in Marbella, Spain, and presence in Madrid, New York, and London, RavenPack takes pride in being a truly diverse global organization.
● Competitive Salary: At RavenPack, we believe that your time and experience need to be fairly rewarded.
● Continuous learning: We provide the support needed to grow within the team.
● Innovation: Innovation is the key to our success, so we encourage you to speak up and tell us about your vision.
● Relocation Assistance: Comprehensive relocation support is available to help you and your family move to our Marbella office on the beautiful Costa del Sol.
● Marbella Shuttle Bus: A free company shuttle to the Marbella office runs from Malaga, Fuengirola, La Riviera, and Estepona.
We are an equal opportunity employer and value diversity at our company. We do not discriminate on the basis of race, religion, colour, national origin, gender, sexual orientation, age, marital status, veteran status, or disability status.
Department: Data
Locations: Marbella, Madrid
Remote status: Hybrid

About RavenPack
RavenPack is the leading big data analytics provider for financial services. Financial professionals rely on RavenPack for its speed and accuracy in analyzing large amounts of unstructured content. Our clients include the most successful hedge funds, banks, and asset managers in the world.