[Spring] N건 파일 업로드에 대한 병렬 처리 최적화

개선 포인트

 

GitHub - sjiwon/study-with-me-be: 여기서 구해볼래? Backend Repository (Refactoring)

여기서 구해볼래? Backend Repository (Refactoring). Contribute to sjiwon/study-with-me-be development by creating an account on GitHub.

github.com

스터디 모집부터 진행 관리까지 하는 Study With Me 웹 애플리케이션에는 여러가지 기능이 있지만 본 포스팅에서 개선할 기능은 스터디 주차 생성이다

 

@RestController
@RequiredArgsConstructor
@RequestMapping("/api/studies/{studyId}/weeks")
public class StudyWeeklyApiController {
    private final CreateStudyWeeklyUseCase createStudyWeeklyUseCase;
    ...
 
    @CheckStudyHost
    @PostMapping(consumes = MULTIPART_FORM_DATA_VALUE)
    public ResponseEntity<StudyWeeklyIdResponse> createWeekly(
            @Auth final Authenticated authenticated,
            @PathVariable final Long studyId,
            @ModelAttribute @Valid final CreateStudyWeeklyRequest request
    ) {
        final Long weeklyId = createStudyWeeklyUseCase.invoke(new CreateStudyWeeklyCommand(
                studyId,
                authenticated.id(),
                request.title(),
                request.content(),
                new Period(request.startDate(), request.endDate()),
                request.assignmentExists(),
                request.autoAttendance(),
                FileConverter.convertAttachmentFiles(request.files())
        ));
        return ResponseEntity.ok(new StudyWeeklyIdResponse(weeklyId));
    }
    
    ...
}

@Service
@RequiredArgsConstructor
public class CreateStudyWeeklyUseCase {
    private final AttachmentUploader attachmentUploader;
    private final StudyWeeklyRepository studyWeeklyRepository;
    private final WeeklyCreator weeklyCreator;
 
    public Long invoke(final CreateStudyWeeklyCommand command) {
        final List<UploadAttachment> attachments = attachmentUploader.uploadAttachments(command.attachments());
        final int nextWeek = studyWeeklyRepository.getNextWeek(command.studyId());
 
        return weeklyCreator.invoke(command, attachments, nextWeek).getId();
    }
}

@Component
@RequiredArgsConstructor
public class AttachmentUploader {
    private final FileUploader fileUploader;
 
    public List<UploadAttachment> uploadAttachments(final List<RawFileData> files) {
        if (CollectionUtils.isEmpty(files)) {
            return List.of();
        }
 
        return files.stream()
                .map(file -> new UploadAttachment(file.fileName(), fileUploader.uploadFile(file)))
                .toList();
    }
}

스터디 주차 생성 시 주차 관련 첨부파일을 N건 등록해서 생성할 수 있다

 

[http-nio-8080-exec-3] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=Algorithm.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@51f717ae]] 
[http-nio-8080-exec-3] |<---S3FileUploader.uploadFile(..) time=759ms 
[http-nio-8080-exec-3] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=AWS.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@9db7d00]] 
[http-nio-8080-exec-3] |<---S3FileUploader.uploadFile(..) time=858ms 
[http-nio-8080-exec-3] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=Baekjoon.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@db9cf7f]] 
[http-nio-8080-exec-3] |<---S3FileUploader.uploadFile(..) time=1010ms 
[http-nio-8080-exec-3] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=Data Structure.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@2ba20f8a]] 
[http-nio-8080-exec-3] |<---S3FileUploader.uploadFile(..) time=1151ms 
[http-nio-8080-exec-3] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=Database.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@42013737]] 
[http-nio-8080-exec-3] |<---S3FileUploader.uploadFile(..) time=1221ms 
[http-nio-8080-exec-3] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=Design Pattern.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@1060710a]] 
[http-nio-8080-exec-3] |<---S3FileUploader.uploadFile(..) time=1307ms 
[http-nio-8080-exec-3] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=Docker.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@5361864e]] 
[http-nio-8080-exec-3] |<---S3FileUploader.uploadFile(..) time=1449ms 
[http-nio-8080-exec-3] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=Java.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@68041df7]] 
[http-nio-8080-exec-3] |<---S3FileUploader.uploadFile(..) time=1531ms 
[http-nio-8080-exec-3] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=jOOQ.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@23f7274f]] 
[http-nio-8080-exec-3] |<---S3FileUploader.uploadFile(..) time=1609ms 
[http-nio-8080-exec-3] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=JPA.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@300cea1a]] 
[http-nio-8080-exec-3] |<---S3FileUploader.uploadFile(..) time=1684ms

 

당연히 정상 동작은 하겠지만 약간 거슬리는 부분이 보인다

현재 구현된 N건 파일 업로드의 프로세스는 다음과 같이 진행된다

이처럼 N건의 파일들이 순차적으로 업로드 되고 있고 이는 굉장히 단순한 방식이기도 하다
위의 쓰레드 로그를 보면 단일 쓰레드가 모든 업로드 작업을 처리하고 있다

이러한 단일 쓰레드 업로드 방식은 이미지별로 업로드되는 시간이 순차적으로 누적되어서 전체 업로드 시간이 그만큼 길어지고 결론적으로 관련 API의 성능이 굉장히 저하된다

단건별로 대략 200ms정도 걸린다면

  • 5개 = 1000ms
  • 10개 = 2000ms
  • 20개 = 4000ms
  • ...

그리고 N건의 파일 업로드의 경우 각각의 업로드간에 순서를 지킬 필요가 없기 때문에 더더욱 병렬 처리가 효과적이라고 판단된다

 

1. Stream API's parallelStream()

CPU Core를 최대한 활용해서 병렬로 N건의 이미지를 분산시켜서 업로드하자

 

단순 stream() 계산 vs CPU Core를 최대로 활용하는 병렬 stream() 계산의 성능 차이를 비교해보자

private val cpuCores: Int = Runtime.getRuntime().availableProcessors()
private val data: List<Int> = (1..cpuCores).toList()

fun main() {
    println("CPU Core = $cpuCores\n")

    execute("Stream") { stream() }
    execute("ParallelStream") { parallelStream() }
}

private fun stream() {
    val sum: Long = data.sumOf {
        Thread.sleep(300)
        (it * it).toLong()
    }
    println("Sum = $sum")
}

private fun parallelStream() {
    val sum = data.parallelStream()
        .mapToLong {
            Thread.sleep(300)
            (it * it).toLong()
        }
        .sum()
    println("Sum = $sum")
}

private fun execute(
    title: String,
    logic: () -> Unit,
) {
    println("## With $title ##")

    val start = System.currentTimeMillis()
    logic()
    val end = System.currentTimeMillis()

    println("Execute = ${end - start}ms\n")
}

이처럼 간단한 계산임에도 불구하고 CPU Core를 극한으로 활용하는 병렬 연산이 더 효과적으로 진행됨을 알 수 있다

이제 실제 이미지 업로드에 병렬 처리를 적용해보자

@Component
@RequiredArgsConstructor
public class AttachmentUploader {
    private final FileUploader fileUploader;

    public List<UploadAttachment> uploadAttachments(final List<RawFileData> files) {
        if (CollectionUtils.isEmpty(files)) {
            return List.of();
        }

        return files.stream()
                .parallel() // 병렬 스트림
                .map(file -> new UploadAttachment(file.fileName(), fileUploader.uploadFile(file)))
                .toList();
    }
}
[http-nio-8080-exec-7] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=JPA.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@38092426]] 
[ForkJoinPool.commonPool-worker-1] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=Database.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@4b93bb48]] 
[ForkJoinPool.commonPool-worker-2] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=AWS.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@7d6e1baf]] 
[ForkJoinPool.commonPool-worker-3] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=Algorithm.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@6026c434]] 
[ForkJoinPool.commonPool-worker-4] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=Docker.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@4895390b]] 
[ForkJoinPool.commonPool-worker-5] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=Baekjoon.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@6adc9d5d]] 
[ForkJoinPool.commonPool-worker-6] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=Data Structure.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@1dc2dba0]] 
[ForkJoinPool.commonPool-worker-7] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=Java.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@4381e462]] 
[ForkJoinPool.commonPool-worker-5] |<---S3FileUploader.uploadFile(..) time=561ms 
[ForkJoinPool.commonPool-worker-1] |<---S3FileUploader.uploadFile(..) time=561ms 
[http-nio-8080-exec-7] |<---S3FileUploader.uploadFile(..) time=846ms 
[ForkJoinPool.commonPool-worker-5] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=Design Pattern.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@46ad62f6]] 
[ForkJoinPool.commonPool-worker-1] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=jOOQ.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@151e4ea1]] 
[ForkJoinPool.commonPool-worker-7] |<---S3FileUploader.uploadFile(..) time=564ms 
[ForkJoinPool.commonPool-worker-2] |<---S3FileUploader.uploadFile(..) time=568ms 
[ForkJoinPool.commonPool-worker-3] |<---S3FileUploader.uploadFile(..) time=570ms 
[ForkJoinPool.commonPool-worker-4] |<---S3FileUploader.uploadFile(..) time=597ms 
[ForkJoinPool.commonPool-worker-6] |<---S3FileUploader.uploadFile(..) time=600ms 
[ForkJoinPool.commonPool-worker-5] |<---S3FileUploader.uploadFile(..) time=74ms 
[ForkJoinPool.commonPool-worker-1] |<---S3FileUploader.uploadFile(..) time=637ms

N건의 이미지 업로드가 병렬 처리됨을 확인할 수 있고 이미지 업로드 시간도 1684ms -> 637ms로 개선되었다

  • parallelStream() & ForkJoinPool의 메커니즘으로 인해 N건이 한번에 병렬 처리되지는 않는것을 알 수 있다

 

※ ForkJoinPool

parallelStream()에서 활용되는 ThreadPool은 ForkJoinPool이다

Fork

  1. Task를 더 작은 Task(Sub Task)로 분할하기
  2. Sub Task도 Recursive하게 더이상 분할할 수 없을때까지 분할 반복
  3. 끝까지 분할했으면 이제 분할한 작업 진행

Join

  1. 분할한 Task를 병렬로 처리한 후 다시 합치기
  2. 기존 Task가 될때까지 Recursive하게 합치기

 

ForkJoinPool에서 Task들을 처리하는 각각의 Thread들은 자신만의 Work Queue를 가지고 있다

여기서 만약 Work Queue에 더이상 진행할 Task가 없다면 Work Stealing 메커니즘을 기반으로 다른 Thread의 Work Queue에서 대기하고 있는 SubTask를 훔쳐와서 실행한다

  • 이러한 극한의 Work Stealing을 통해서 Thread는 Idle Status에 빠지지 않고 지속적으로 작업을 수행할 수 있다

 

CPU Bound vs I/O Bound

 

Scaling Up IO Tasks in Java

Though IO operations are painfully slow it is still possible to efficiently scale up an application which deals with a lot of IO

medium.com

 

Java - ForkJoinPool vs ExecutorService - GeeksforGeeks

A Computer Science portal for geeks. It contains well written, well thought and well explained computer science and programming articles, quizzes and practice/competitive programming/company interview Questions.

www.geeksforgeeks.org

현재 우리의 목표는 N건의 이미지 업로드 속도를 개선하는 것이고 이러한 작업은 I/O Bound Task이다


위에서 살펴봤듯이 parallelStream()을 통해서 병렬 처리를 진행하면 내부적으로 ForkJoinPool이라는 ThreadPool을 사용하게 된다
ForkJoinPool은 CPU 코어수를 기반으로 동작하는 ThreadPool이고 parallelStream()은 별도의 설정이 없다면 ForkJoinPool.commonPool()을 공유해서 사용한다

 

parallelStream()의 ThreadPool 크기를 조절하는 방법은 크게 2가지 존재한다

  1. java.util.concurrent.ForkJoinPool.common.parallelism 설정값 적용 (System.setProperty)
  2. 직접 ForKJoinPool 정의해서 병렬 Task submit하기


하지만 이러한 설정은 개별 설정이 아니라 ForkJoinPool에 대한 전역적인 설정이다


정리해보면 ForkJoinPool과 같은 CPU Core 기반의 ThreadPool을 사용하게 되면 I/O Bound Task에 대해서는 비효율적으로 동작할 수 있다

 

CPU Bound Task with ForkJoinPool

  • I/O 작업이 거의 없어서 Context Switching이 거의 발생하지 않고 오버헤드가 최소화되고 CPU 리소스를 극한으로 독점하면서 활용할 수 있어서 효율적이다

I/O Bound Task with ForkJoinPool

  • I/O Bound Task의 경우 Thread에 대한 빈번한 Context Switching이 발생할 수 있고 적절하지 않은 많은 Thread를 관리하고 있다면 Context Switching에 의한 오버헤드로 인해 리소스나 성능적인 측면에서 손해를 볼 수 있다

 

따라서 우리의 목표인 I/O Bound Task의 경우 CPU 코어수를 기반으로 동작하는 ForkJoinPool은 비효율적으로 동작할 수 있고 적합하지 않은 ThreadPool이다

 

 

2. CompletableFuture

CompletableFuture는 아래와 같은 이유로 현재 목표인 I/O Bound Task 성능 개선에 적합한 방식이라고 생각한다

  1. 비동기적 메커니즘
    • I/O 작업을 비동기적으로 실행하기 때문에 해당 작업으로 인해 Thread를 Blocking시키지 않는다
  2. 리소스 활용 최적화
    • parallelStream()의 경우 모든 병렬 스트림 작업은 공유된 ForkJoinPool에 의해 진행된다
      • 이 시점에 I/O 작업에 의한 Thread Blocking + Context Switching 오버헤드
    • 하지만 CompletableFutuer의 경우 Task별로 세밀한 ThreadPool을 설정할 수 있기 때문에 해당 작업에 적절한 Thread를 할당한다면 리소스 낭비를 줄일 수 있다

  • 물론 CompletableFuture의 경우에도 별도의 ThreadPool을 지정하지 않으면 ForkJoinPool을 사용한다
우리의 목표인 I/O Bound Task의 성능 개선을 위해서는 독립적인 ThreadPool을 구성해서 CompletableFuture에 적용해야 한다

 

@Bean(name = "fileUploadExecutor")
public Executor fileUploadExecutor() {
    final ThreadPoolTaskExecutor executor = new ThreadPoolTaskExecutor();
    executor.setCorePoolSize(10);
    executor.setMaxPoolSize(40);
    executor.setQueueCapacity(50);
    executor.setRejectedExecutionHandler(new ThreadPoolExecutor.CallerRunsPolicy());
    executor.setAwaitTerminationSeconds(60);
    executor.setTaskDecorator(new MdcTaskDecorator());
    executor.setThreadNamePrefix("Asynchronous File Upload Thread-");
    return executor;
}

@Component
public class AttachmentUploader {
    private final FileUploader fileUploader;
    private final Executor executor;

    public AttachmentUploader(
            final FileUploader fileUploader,
            @Qualifier("fileUploadExecutor") final Executor executor
    ) {
        this.fileUploader = fileUploader;
        this.executor = executor;
    }

    public List<UploadAttachment> uploadAttachments(final List<RawFileData> files) {
        if (CollectionUtils.isEmpty(files)) {
            return List.of();
        }

        final List<CompletableFuture<UploadAttachment>> result = files.stream()
                .map(file -> CompletableFuture.supplyAsync(
                        () -> new UploadAttachment(file.fileName(), fileUploader.uploadFile(file)),
                        executor
                ))
                .toList();

        return result.stream()
                .map(CompletableFuture::join)
                .toList();
    }
}
[Asynchronous File Upload Thread-9] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=Algorithm.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@14fbf018]] 
[Asynchronous File Upload Thread-10] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=Baekjoon.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@432a14a3]] 
[Asynchronous File Upload Thread-4] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=Data Structure.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@174f3afd]] 
[Asynchronous File Upload Thread-5] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=Database.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@5cff633c]] 
[Asynchronous File Upload Thread-3] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=Design Pattern.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@2f5e473d]] 
[Asynchronous File Upload Thread-8] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=AWS.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@3cf9e563]] 
[Asynchronous File Upload Thread-1] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=Docker.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@3c42c179]] 
[Asynchronous File Upload Thread-7] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=jOOQ.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@7575f327]] 
[Asynchronous File Upload Thread-2] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=Java.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@52349cde]] 
[Asynchronous File Upload Thread-6] |--->S3FileUploader.uploadFile(..) args=[RawFileData[fileName=JPA.png, contenType=image/png, extension=PNG, uploadType=STUDY_WEEKLY_ATTACHMENT, content=sun.nio.ch.ChannelInputStream@28cf251b]] 
[Asynchronous File Upload Thread-7] |<---S3FileUploader.uploadFile(..) time=23ms 
[Asynchronous File Upload Thread-3] |<---S3FileUploader.uploadFile(..) time=93ms 
[Asynchronous File Upload Thread-4] |<---S3FileUploader.uploadFile(..) time=93ms 
[Asynchronous File Upload Thread-6] |<---S3FileUploader.uploadFile(..) time=92ms 
[Asynchronous File Upload Thread-1] |<---S3FileUploader.uploadFile(..) time=93ms 
[Asynchronous File Upload Thread-10] |<---S3FileUploader.uploadFile(..) time=121ms 
[Asynchronous File Upload Thread-5] |<---S3FileUploader.uploadFile(..) time=134ms 
[Asynchronous File Upload Thread-2] |<---S3FileUploader.uploadFile(..) time=146ms 
[Asynchronous File Upload Thread-8] |<---S3FileUploader.uploadFile(..) time=146ms 
[Asynchronous File Upload Thread-9] |<---S3FileUploader.uploadFile(..) time=156ms