BACK TO ALL POSTS

Shinobu: Planner app for optimised orders in mandarake.co.jp

April 5th, 2020 • 18 min read

Thumbnail
https://shinobu.noobsaigon.com

Background

If you have read my previous Case Study about how I created a personal manga database, you would have known that I am a big fan of the anime/manga culture and currently pursuing a hobby of collecting raw manga books. As I am currently based in Singapore, the only way for me to acquire these raw manga books is by ordering them online through Japanese-based book sites. Among all of these sites, Mandarake is usually my go-to option if possible.

Mandarake
https://www.mandarake.co.jp/

Mandarake is a famous chain store selling second-hand products derived from anime/manga culture. Their stores are spread across Japan and all of the products from each store are available to be purchased online. Mandarake is a great source for finding cheap manga, though I find them particularly useful for getting older, out-of-print series that you cannot get anywhere else (since no official retailer stocks them anymore).

The only problem with Mandarake is that since all of their products are user-driven due to the second-hand nature, each product is only available in certain stores and whenever you order a product, you need to indicate which store you want to purchase the product from as their prices and availability might vary. So in order to purchase a list of books, I might need to order them from multiple different branches of Mandarake. Unfortunately, Mandarake does not offer combining all orders from different stores together, but instead will treat them as unique orders with separate deliveries. Hence, if we do not plan our orders carefully, we might unnecessarily order from more stores than we actually need to, which will result in redundantly paying for extra delivery fees. 😔

Mandarake Cart
Mandarake Checkout Page

As usual, this inspired me to quickly come up with a planner app to solve this problem. Given a list of books that I want to purchase, this planner app will need to provide the most optimal set of orders from Mandarake for me. This article will shed some light on how the app was created and my learnings throughout the process.

Technology Stack

Since this project is relatively small and only for my personal use, I decided to go with the easiest setup by using create-react-app. This would allow me to focus primarily on implementing the logic of planning orders without worrying about all of the build configurations for the web app.

Hence, my core tech stack will include:

  • React
  • create-react-app
  • Typescript
  • Netlify

Some other cool tech/libraries besides the core stack will be elaborated further down the line to make it more exciting. 🤩

Home page

UI

Home Page
Home Page

The home page is where the users will land on when they first access the website. It is also the page where users will spend most of their time in. There are 3 major components inside the page:

  • Header: The header contains the input text where users can paste a Mandarake listing URL so that the app will scrape the information of the listing and store them in the cart. Apart from the input text, the header will also display the average total price of all listings in the cart so far.
  • Body: This is where the content of the cart will be displayed. Each listing will be represented as a card showing its average price among the stores and the list of stores selling it. Each store name inside the card is an actual link to the page of the corresponding listing under that store.
  • Action Buttons: Having fixed positions at the bottom right corner of the application are two main action buttons of the application:

    • Refresh Button: Clicking this will make the application scrape for information of all current listings in the cart again to acquire the latest statuses of these listings.
    • Checkout Button: Clicking this will make the application analyze the current cart and compute the optimal set of orders for it. Then, it will bring the users to the Checkout page where the set of orders will be displayed.

The UI layout of this page can be easily implemented with React. Furthermore, if you have read my previous Case Study, the UI layout of this home page also looks very similar to the CMS interface of my manga database. Hence, we will not dive deeper into how the UI for this page was constructed. Instead, we will look into how the data source of the page i.e the listings cart was formed.

Data Source

By observing the UI above, for each book, these are the data we need:

  • The list of stores selling the book. For each store, we need to know the store name, the price of the listing inside this store and the URL of the book in this specific store.
  • The thumbnail information of the book i.e URL, width and height.
  • The timestamp when the book information was last scraped. This is necessary for the application to decide whether it should scrape for the book's information again when the users click the Refresh button.
export type BookSource = {
  price: number;
  storeName: string;
  url: string;
};

export type Book = {
  id: string;
  thumbnailURL: string;
  thumbnailWidth: number;
  thumbnailHeight: number;
  sources: BookSource[];
  lastUpdatedAt: number;
};

For each card, we need to display the average price of the book among all stores selling it. Then, we will also need to display the total of these average prices in the header as the approximate cost of the cart. These numbers can be easily computed with some utility functions.

export function getAveragePrice(book: Book): number {
  const averagePrice = Math.floor(
    book.sources.reduce(
      (acc: number, source: BookSource) => acc + source.price,
      0,
    ) / book.sources.length,
  );

  return averagePrice;
}

export function getTotalPrice(books: Book[]): number {
  const total = books.reduce(
    (acc: number, book: Book) => acc + getAveragePrice(book)
  );

  return Math.ceil(total);
}

Scraper

So after the data source is defined, the next thing we will look at is how the data can be created. As indicated in the UI section, to add a book to the cart, users will need to pass the book URL to the input text in the header where in turn the app will scrape for the book information and store it inside the cart.

The scraper will need to run on a non-browser environment since the CORS (Cross-Origin Resource Sharing) settings on Mandarake will prevent the browser from accessing the page content. In the last Case Study, we have already created a scraper API endpoint for multiple sites with Netlify Functions. For this use case, we will adopt the same approach but focus solely on Mandarake where we will scrape extra information for each book as defined in the data shape from previous section.

const got = require("got");
const cheerio = require("cheerio");

async function scrape(bookURL) {
  const { body: htmlString } = await got(bookURL);
  const $ = cheerio.load(htmlString);

  // Scrape images
  const imageURLs = [];

  // When there are multiple preview images
  if ($(".xzoom-thumbs img").length > 0) {
    $(".xzoom-thumbs img").each(function() {
      imageURLs.push($(this).attr("src"));
    });
  } else {
    imageURLs.push($(".pic img").attr("src"));
  }

  // Scrape current store's name
  const storeName = $(".__shop")
    .eq(0)
    .text()
    .trim();

  // Scrape current store's price
  const price =
    $(".soldout").length === 0
      ? parseInt(
          $(".__price")
            .eq(0)
            .text()
            .trim()
            .replace(/,/g, ""),
          10,
        )
      : null;

  // If price is not nil, the book is not sold out in the current store
  if (price != null) {
    sources.push({
      price,
      storeName,
      url: bookURL,
    });
  }

  // Scrape other store's names and their corresponding prices
  $(".other_itemlist .block").each(function() {
    const url = $(this)
      .find(".title a")
      .attr("href");
    if (url.startsWith("#")) {
      return;
    }

    const storeName = $(this)
      .find(".shop")
      .text()
      .trim();
    const price = parseInt(
      $(this)
        .find(".price")
        .text()
        .trim()
        .replace(/,/g, ""),
      10,
    );
    sources.push({
      price,
      storeName,
      url: `https://order.mandarake.co.jp${url}`,
    });
  });

  return {
    thumbnailURL: imageURLs[0],
    sources,
  };
}
const ALLOWED_HOSTNAMES = ["order.mandarake.co.jp"];

exports.handler = async (event, context) => {
  const { httpMethod, queryStringParameters } = event;

  // Only allow GET
  if (httpMethod !== "GET") {
    return { statusCode: 405, body: "Method Not Allowed" };
  }

  // Missing required query `url`
  const { bookURL } = queryStringParameters;
  if (bookURL == null) {
    return { statusCode: 400, body: "Missing query `bookURL`" };
  }

  try {
    const parsedBookURL = new URL(bookURL);
    if (!ALLOWED_HOSTNAMES.includes(parsedBookURL.hostname)) {
      return { statusCode: 400, body: "Book URL is not supported" };
    }
    const scrapedData = await scrape(bookURL);

    return {
      statusCode: 200,
      body: JSON.stringify({
        id: bookURL,
        ...scrapedData,
      }),
      headers: {
        "Content-Type": "application/json",
      },
    };
  } catch (error) {
    return {
      statusCode: 500,
      body: "Something wrong happened",
    };
  }
};

Local Storage

So the next question after we successfully scrape the data of the book is where should we store the shopping cart data?

A straightforward way is to just put the cart data under a state of the React component using useState hook.

import { useState } from "react";

const App = () => {
  const [currentBooks, setCurrentBooks] = useState<Book[]>([]);

  // ...
}

However, a problem with this approach is that the cart data will not be persisted as it will be refreshed whenever the browser session ends. Hence, a better way to store the shopping cart is to save it under the Local Storage of the browser:

  • Everytime there's an update in the cart, the corresponding field for the cart inside the local storage will also be updated accordingly.
  • At the beginning of the application, the app component will read the data from the local storage to populate the cart.

We can create a utility hook so that our React component can interact with the local storage more easily:

import { useState } from "react";

export default function useLocalStorage<T>(
  key: string, // key inside the local storage
  initialValue: T, // initial value if key does not exist
): [T, (value: T) => void] {
  const [storedValue, setStoredValue] = useState<T>(
    (): T => {
      try {
        const item = window.localStorage.getItem(key);
        return item != null
          ? JSON.parse(item)
          : initialValue;
      } catch (error) {
        console.error(error);
        return initialValue;
      }
    },
  );

  const setValue = (value: T): void => {
    try {
      const valueToStore =
        value instanceof Function
          ? value(storedValue)
          : value;
      setStoredValue(valueToStore);
      window.localStorage.setItem(
        key,
        JSON.stringify(valueToStore),
      );
    } catch (error) {
      console.error(error);
    }
  };

  return [storedValue, setValue];
}
import useLocalStorage from "../hooks/useLocalStorage";

const App = () => {
  const [currentBooks, setCurrentBooks] = useLocalStorage<Book[]>(
    "books",
    [],
  );

  // ...
};

With this utility hook, our cart data is successfully linked to the local storage and users will always be able to see their latest shopping cart whenever they revisit the site. 👏

Checkout page

So after populating the shopping cart, the next step for the user is to check out and the responsibility for the app here is to provide the user the ideal set of orders with minimized cost. This set of orders will be displayed under the Checkout page.

UI

Checkout Page
Checkout Page

The UI of Checkout page is fairly similar to the UI of Home page. The main difference is that the cards are splitted into separate groups where each group represents a single order from a store.

For each group, the listing links of the group's store are highlighted with the corresponding listing price so that the users can easily click on the link to add to the actual Mandarake cart.

Similar to the Home page, the UI layout of this page can be easily implemented with React. Hence, instead of elaborating on how the UI for this page was created, we will look into how the data source of the page i.e the set of orders was computed. This is the root problem that our application was created to solve.

Partitioning Problem

So to recap the problem again, given a list of books from Mandarake where each book is only available in a limited number of Mandarake branches, we want to create an optimal set of orders where:

  • All books are purchased
  • The number of orders in the set is minimized since each order will incur a separate delivery fee.
  • Each order should have its total cost less than 25000円. This is necessary as Singapore will put a Goods and Services Tax (GST) of 7% to each imported package with its price exceeding 400SGD. Hence, by accounting for the delivery cost, the maximum price for each order should be 25000円 so that the delivery packaged will have GST relief granted. Because of this, there can be multiple orders from a single store so that we can avoid the GST.

Optimal Solution

The natural approach for this problem is to simulate all possible combinations of orders. Given each store order as a bucket, for each book, we will check whether putting the book in the i-th bucket will give us the best result i.e the number of empty buckets is maximized.

So before we can proceed with the simulation, we first need to create all the empty buckets for the books to be put. As each store can have multiple orders/buckets, we need to calculate the maximum number of buckets for a Mandarake branch by dividing the total cost of all available books from that store by the maximum cost per order (25000円).

const MAX_PRICE_PER_BUCKET = 25000;

function initializeBuckets(books: Book[]) {
  const storeMaxPriceMap: { [name: string]: number } = {};
  for (const book of books) {
    for (const { storeName, price } of book.sources) {
      if (storeMaxPriceMap[storeName] == null) {
        storeMaxPriceMap[storeName] = 0;
      }
      storeMaxPriceMap[storeName] += price;
    }
  }

  const bucketPrices: number[] = [];
  const bucketStores: string[] = [];
  const storeMaxPriceEntries = Object.entries(storeMaxPriceMap);
  for (const [storeName, maxPrice] of storeMaxPriceEntries) {
    const maxBuckets = Math.ceil(maxPrice / MAX_PRICE_PER_BUCKET);
    bucketPrices.concat([...Array(maxBuckets)].fill(0));
    bucketPrices.concat([...Array(maxBuckets)].fill(storeName));
  }

  return {
    bucketPrices,
    bucketStores,
  };
}

After initializing the buckets, for each book in the list, we can put it into one of the buckets that we have created. For each of these choices, we will recursively simulate with one less book in the list. If we place all books successfully, we will check if the number of filled buckets is the smallest so far and record it down. Then, we will backtrack the recursion stack and continue to simulate all other possible combinations.

We can integrate some speedups to improve the duration of the search:

  • We will only put a book into those buckets/orders where the stores they are representing have the book in stock.
  • We will only put a book into those buckets/orders wheere the order cost afteward does not exceed 25000円.
  • At any point during the search, if the current simulation has more filled buckets than the minimum number so far, we will terminate this simulation early since further search in this branch will not give us an optimal solution anyway.
./src/utils/optimalBooksPartition.ts
const MAX_PRICE_PER_BUCKET = 25000;

type Dict<T> = { [key: string]: T };

export function partitionBooks(books: Book[]): Dict<Book[][]> {
  const bookSourceMapList: Dict<BookSource>[] = books.map(book => {
    return book.sources.reduce((map: Dict<BookSource>, source) => {
      map[source.storeName] = source;
      return map;
    }, {});
  });
  const { bucketPrices, bucketStores } = initializeBuckets(books);
  const bucketBooksList: Book[][] = bucketPrices.map(() => []);

  let bestBucketBooksList: Book[][] = [];
  let bestBucketCount = Infinity;

  function backtrack(bookIndex: number, bucketFilled: number): void {
    if (bucketFilled > bestBucketCount) {
      return;
    }

    // Finish packing all the books
    if (bookIndex === books.length) {
      if (bucketFilled < bestBucketCount) {
        bestBucketBooksList = JSON.parse(
          JSON.stringify(bucketBooksList),
        );
        bestBucketCount = bucketFilled;
      }
      return;
    }

    const currentBook = books[bookIndex];
    for (
      let bucketIndex = 0;
      bucketIndex < bucketPrices.length;
      bucketIndex++
    ) {
      const storeName = bucketStores[bucketIndex];
      const bucketPrice = bucketPrices[bucketIndex];
      const bookSource = bookSourceMapList[bookIndex][storeName];

      // Store does not have this book
      if (bookSource == null) {
        continue;
      }

      const { price } = bookSource;
      // Bucket price exceeds the limit if this book is added
      if (bucketPrice + price > MAX_PRICE_PER_BUCKET) {
        continue;
      }

      const isEmpty = bucketPrice === 0;
      bucketPrices[bucketIndex] += price;
      bucketBooksList[bucketIndex].push(currentBook);
      backtrack(
        bookIndex + 1,
        isEmpty ? bucketFilled + 1 : bucketFilled,
      );
      bucketPrices[bucketIndex] -= price;
      bucketBooksList[bucketIndex].pop();
    }
  }

  backtrack(0, 0);

  const map: Dict<Book[][]> = {};
  for (const bucketIndex in bestBucketBooksList) {
    const storeName = bucketStores[bucketIndex];
    const bucketBooks = bestBucketBooksList[bucketIndex];

    if (bucketBooks.length > 0) {
      if (map[storeName] == null) {
        map[storeName] = [];
      }
      map[storeName].push(bucketBooks);
    }
  }

  return map;
}

Sub-optimal Solution

By analyzing the time complexity of the above solution, we can conclude that this partitioning problem is an NP-hard problem. Hence, the time taken for us to get the most optimal set of orders will increase exponentially as the number of books increases.

The solution above might be ideal as a long-running script in the local environment. However, for the browser environment of our web application, the possibily lengthy duration of this synchronous task is not really ideal as it will freeze the whole interface of the app due to Javascript's single-threaded nature.

An alternative approach here is to come up with a sub-optimal solution, which has a much better time complexity while provides a nearly optimal set of orders for most of the time.

For this solution, we will keep track of the list of books that we can purchase from each store. The list of books for each of the store will be sorted by increasing range of availability then decreasing price. For every iteration, we will choose the store with the highest total price of all the buyable books to create an order. We will exhaustively add books from the sorted list of this store until it crosses the limit of 25000円. After every iteration, we will remove the recently added books to the new order from the sorted list of books from each store. We will keep doing this until all of the lists are empty. The time complexity of this solution is linear to the number of books that we want to purchase.

./src/utils/subOptimalBooksPartition.ts
const MAX_PRICE_PER_BUCKET = 25000;

type Dict<T> = { [key: string]: T };

export function partitionBooks(books: Book[]): Dict<Book[][]> {
  // Populate the map of sources for each book
  const bookSourcesMap: Dict<Dict<BookSource>> = {};
  for (const book of books) {
    bookSourcesMap[book.id] = {};
    for (const source of book.sources) {
      bookSourcesMap[book.id][source.storeName] = source;
    }
  }

  // Populate the list of books for each store
  const storeBooksMap: Dict<{ total: number; books: Book[] }> = {};
  for (const book of books) {
    for (const source of book.sources) {
      if (storeBooksMap[source.storeName] == null) {
        storeBooksMap[source.storeName] = {
          total: 0,
          books: [],
        };
      }
      storeBooksMap[source.storeName].total += source.price;
      storeBooksMap[source.storeName].books.push(book);
    }
  }

  // Sort store entries by their total price
  const stores = Object.entries(storeBooksMap);
  stores.sort(
    ([, { total: totalA }], [, { total: totalB }]) => totalA - totalB,
  );
  for (const [storeName, { books: storeBooks }] of stores) {
    // Sort books by their range of availability
    // Book that exists in the fewest stores comes first
    storeBooks.sort((bookA, bookB) => {
      if (bookA.sources.length === bookB.sources.length) {
        return (
          bookSourcesMap[bookA.id][storeName].price -
          bookSourcesMap[bookB.id][storeName].price
        );
      } else {
        return bookA.sources.length - bookB.sources.length;
      }
    });
  }

  const visited: Dict<boolean> = {};
  const map: Dict<Book[][]> = {};

  while (Object.keys(visited).length < books.length) {
    // Get store with currently top total price
    const [storeName, { books: storeBooks }] = stores[
      stores.length - 1
    ];

    if (map[storeName] == null) {
      map[storeName] = [];
    }

    let price = 0;
    const bucketBooks: Book[] = [];
    for (const book of storeBooks) {
      // If book has already been put into an order, skip it
      if (visited[book.id]) {
        continue;
      }
      // Skip the book if adding it will exceed the maximum price for an order
      const bookPrice = bookSourcesMap[book.id][storeName].price;
      if (price + bookPrice > MAX_PRICE_PER_BUCKET) {
        continue;
      }

      price += bookPrice;
      bucketBooks.push(book);
      visited[book.id] = true;
    }

    // If order is not empty, add it to the list of orders for this store
    if (bucketBooks.length > 0) {
      map[storeName].push(bucketBooks);
    }

    // Update the list of books and the total price of each store
    // by filtering out all recently added books in the new order
    for (const [storeName, storeInfo] of stores) {
      storeInfo.books = storeInfo.books
        .map(book => {
          const bookPrice = bookSourcesMap[book.id][storeName].price;
          if (visited[book.id]) {
            storeInfo.total -= bookPrice;
            return null;
          }

          return book;
        })
        .filter((book): book is Book => book != null);
    }

    // Sort store entries by their newly updated total price
    stores.sort(
      ([, { total: totalA }], [, { total: totalB }]) =>
        totalA - totalB,
    );
  }

  return map;
}

Web Worker

Although we have already managed to come up with the sub-optimal solution, under normal cases where the list of books to be purchased is not really large, the optimal solution is still the preferrable way. So how can we run these 2 solutions in parallel and only choose the sub-optimal result if the optimal solution takes too long (more than 5 seconds) to finish?

As both solutions are synchronous functions, running them concurrently in UI thread of browser is not possible. This is where Web Worker will help us to achieve this.

Web Workers makes it possible to run a script operation in a background thread separate from the main execution thread of a web application. The advantage of this is that laborious processing can be performed in a separate thread, allowing the main (usually the UI) thread to run without being blocked/slowed down.

Hence, by leveraging on Web Workers, we can run the two solutions above in parallel. 🎉 An easy way to adopt web worker is to use the Webpack loader workerize-loader, which will make calling all exported functions from the required module run in the background thread instead of the main execution thread of our web application. Hence, we only need to update how we call the partition functions in the event handler of clicking Checkout button from Home page.

import { useState } from "react";

import useLocalStorage from "../hooks/useLocalStorage";

const optimalWorker = require(
  "workerize-loader!../utils/optimalBooksPartition"
)();
const subOptimalWorker = require(
  "workerize-loader!../worker/subOptimalBooksPartition"
)();

const timeOut = (duration: number): Promise<void> => {
  return new Promise(resolve => {
    setTimeout(resolve, duration);
  });
};

const App = () => {
  const [currentBooks, setCurrentBooks] = useLocalStorage<Book[]>(
    "books",
    [],
  );
  const [
    partitionedBooks,
    setPartitionedBooks
  ] = useState<Dict<Book[][]> | null>(null);

  const onCheckoutButtonClick = async (): Promise<void> => {
    const [
      optimalPartitionedBooks,
      subOptimalPartitionedBooks,
    ] = await Promise.all([
      Promise.race([
        timeOut(5000),
        optimalWorker.partitionBooks(currentBooks),
      ]),
      subOptimalWorker.partitionBooks(currentBooks),
    ]);
    setPartitionedBooks(
      optimalPartitionedBooks || subOptimalPartitionedBooks,
    );
  };

  // ...
};

With this, both of our solutions will be run in the background thread where the optimal solution will be ignored if it takes longer than 5 seconds. 🙌

Final Thoughts

This concluded the case study of my order planner app. Thanks to the app, I have had a much easier time planning my orders from Mandarake with cost minimized as much as possible. Throughout the process of creating the application, I have also learnt a lot of new techniques and reinforce my knowledge on various aspects of web development. By solving the problem, it also made us realize that sometimes we can sacrifice the correctness for the speed and finding the balance between them is really important. The full source code of the app can be found under:

https://github.com/imouto1994/shinobu

At this point, the app is still far from being done and I have plans to add more helpful features to the app in the future such as:

  • Allow users to login and create Mandarake orders directly through the app
  • Support planning for different countries instead of only Singapore

But until the next update comes, see you when I see you! ✌️