header_pic
Back to blog home

SQLAlchemy Query Related Posts

by Adam on 2020-04-29

This post relates to my answer to this SO question: https://stackoverflow.com/q/61493685/42346

Given a specific article, the OP wanted to query all other articles grouped by the number of tags in common. For example, from the following set:

Article1.tags = tag1,tag2,tag3,tag4
Article2.tags = tag1,tag3,tag5
Article3.tags = tag1,tag3,tag4,tag5

Given Article1 he wants the query to return:

Common Tags | Article
3             Article3 
2             Article2
So we can do it in the following manner. First, we have a bunch of imports:

import sqlalchemy, enum
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import (create_engine, Table, Integer, ForeignKey, 
    Column, String, Enum)
from sqlalchemy.orm import backref, relationship, sessionmaker 
from sqlalchemy.sql import func
from sqlalchemy import create_engine
from sqlalchemy.ext.hybrid import hybrid_property

And we declare our tables:

engine = create_engine('sqlite:///:memory:', echo=True)
Base = declarative_base()

article_tags = Table("article_tag", Base.metadata,
    Column('article_id', Integer, ForeignKey('articles.id')),
    Column('tag_id', Integer, ForeignKey('tags.id')))


class Tag(Base):
    __tablename__ = 'tags'
    id = Column(Integer,primary_key=True,index=True)
    tag = Column(String(64),unique=True,index=True)
    articles = relationship('Article',
                            secondary=article_tags,
                            back_populates='tags')


class Article(Base):
    __tablename__ = 'articles'
    id = Column(Integer, primary_key=True)
    article_title = Column(String(64))
    tags = relationship('Tag',
                        secondary=article_tags,
                        back_populates='articles')

Then we can insert our data:

Base.metadata.create_all(engine)
Session = sessionmaker(engine)
session = Session()

tag1 = Tag(tag='a')
tag2 = Tag(tag='b')
tag3 = Tag(tag='c')
tag4 = Tag(tag='d')
tag5 = Tag(tag='e')

article1 = Article(article_title='foo',tags=[tag1,tag2,tag3,tag4])
article2 = Article(article_title='bar',tags=[tag1,tag3,tag5])
article3 = Article(article_title='baz',tags=[tag1,tag3,tag4,tag5])

session.add_all([tag1,tag2,tag3,tag4,tag5,article1,article2,article3])
session.commit()          

# select just one article as an example
article_id = 1

This employs just one subquery:

sub_stmt = session.query(article_tags.c.tag_id)\
                  .filter(article_tags.c.article_id==article_id)

And here's the rest of the query:

session.query(Article.id,
              func.count(article_tags.c.tag_id).label('total'),
              func.group_concat(article_tags.c.tag_id).label('related_tags'))\
       .filter(Article.id!=article_id)\
       .filter(article_tags.c.tag_id.in_(sub_stmt))\
       .filter(article_tags.c.article_id==Article.id)\
       .group_by(Article.id)\
       .order_by(func.count(article_tags.c.tag_id).desc()).all()

I recently implemented this for my own blog using plain SQL:

        SELECT p.id, p.title, title_slug, 
            count(rt.id) related_tag_count, 
            string_agg(rt.title, ' ') related_tags, 
            string_agg(rt.slug, ' ') related_tag_slugs 
         FROM post p 
            JOIN post_tag rpt ON rpt.post_id = p.id
            JOIN tag rt ON rt.id = rpt.tag_id 
         WHERE p.id != %s
            AND rt.slug IN (
                SELECT t.slug 
                FROM 
                    post_tag pt 
                    JOIN tag t
                    ON t.id = pt.tag_id 
                WHERE 
                    pt.post_id = %s
             ) 
         GROUP BY p.id, p.title, title_slug 
         ORDER BY count(rt.id) DESC;

tags: sqlalchemy sql database many-to-many sqlite

Post thanked 0 time(s).

Related posts: